Are LLM Uncertainty and Correctness Encoded by the Same Features? A Functional Dissociation via Sparse Autoencoders

Published

April 23, 2026

Collected April 23, 2026, 00:00

Industry news, score 6.5: Uses sparse autoencoders (SAEs) to dissociate uncertainty features from correctness features. A novel interpretability approach with implications for reliable AI deployment.

Original: arXiv


Scoring rationale: Uses sparse autoencoders (SAEs) to dissociate uncertainty features from correctness features. A novel interpretability approach with implications for reliable AI deployment.

Statistics, Not Scale: Modular Medical Dialogue with Bayesian Belief Engine

Continuous Semantic Caching for Low-Cost LLM Serving