Rating 6 · Source: cs.CL updates on arXiv.org · Published 2026-04-20
Rating rationale: LLM internal states mainly reflect knowledge recall rather than truthfulness: challenging the mainstream narrative
Key points
arXiv:2510.09033v3 Announce Type: replace Abstract: Recent work suggests that LLMs “know what they don’t know”, positing that hallucinated and factually correct outputs arise from distinct internal processes and can therefore be distinguished using internal signals. However, hallucinations have multifaceted causes: beyond simple knowledge gaps, they can emerge from training incentives that encourage models to exploit statistical shortcuts or spurious associations learned during pretraining. In this paper, we argue that when LLMs rely on such learned associations to produce hallucinations, thei…
🤖 AI Commentary
This paper provides important information for the AI field and merits attention from industry practitioners.