Evaluating Large Language Models' Ability to Detect Character Knowledge Errors in Role-Playing
Even the latest large language models struggle to effectively detect known knowledge errors and unknown knowledge errors when playing roles, particularly for familiar knowledge.