Rating 6 · Source: cs.CL updates on arXiv.org · Published 2026-04-17
Rating rationale: Probes the boundary between internal knowledge and external expression in LLMs, important for understanding knowledge representation
arXiv:2604.14180v1 Announce Type: new Abstract: We train a 318M-parameter Transformer language model from scratch on a curated corpus of 1.56 billion tokens of pure Classical Chinese, with zero English characters or Arabic numerals. Through systematic out-of-distribution (OOD) testing, we investigate whether the model can distinguish known from unknown inputs, and crucially, whether it can express this distinction in its generated text. We find a clear dissociation between internal and external uncertainty.
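The abstract's core probe, whether a model's internal signal separates known from unknown inputs, can be illustrated with a deliberately tiny stand-in. The sketch below is not the paper's 318M Transformer or its actual probing method; it uses a character-level unigram model with add-alpha smoothing (all names and the toy corpus are hypothetical) to show how mean per-character negative log-likelihood, one simple internal uncertainty signal, tends to be higher on out-of-distribution text than on in-distribution text.

```python
import math
from collections import Counter

def train_unigram_nll(corpus: str, alpha: float = 1.0):
    """Fit a smoothed character unigram model; return a scorer that
    computes mean negative log-likelihood (NLL) per character."""
    counts = Counter(corpus)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # +1 reserves mass for unseen symbols

    def nll(text: str) -> float:
        # Higher mean NLL = the model finds the text more surprising.
        return sum(
            -math.log((counts.get(ch, 0) + alpha) / (total + alpha * vocab_size))
            for ch in text
        ) / max(len(text), 1)

    return nll

# Toy "Classical Chinese" training corpus (hypothetical, for illustration only).
nll = train_unigram_nll("学而时习之不亦说乎" * 20)

in_dist = nll("学而时习之")   # characters seen in training
ood = nll("abcde12345")       # English letters and digits: fully OOD here
print(in_dist < ood)          # → True: the internal signal separates the two
```

This only shows the internal side of the dissociation the abstract describes: the score distinguishes known from unknown inputs, but nothing forces the model's generated text to express that uncertainty.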