星际流动

From Hallucination to Structure Snowballing: The Alignment Tax of Constrained Decoding in LLM Reflection

Academic Frontier · Score 6.4 — an AI research paper of moderate reference value
Original source: cs.CL updates on arXiv.org

Published 2026-04-08

Rating basis: an AI research paper of moderate reference value

arXiv:2604.06066v1 Announce Type: new Abstract: Intrinsic self-correction in Large Language Models (LLMs) frequently fails in open-ended reasoning tasks due to "hallucination snowballing," a phenomenon in which models recursively justify early errors during free-text reflection. While structured feedback can mitigate this issue, existing approaches often rely on externally trained critics or symbolic tools, reducing agent autonomy. This study investigates whether enforcing structured reflection purely through Outlines-based constrained decoding can disrupt error propagation without additional
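The mechanism the abstract describes, masking the decoder so that reflection must follow a fixed schema regardless of what the unconstrained model would prefer to say, can be sketched with a toy DFA standing in for Outlines' compiled regex or JSON-schema grammars. Everything below (the vocabulary, the scores, the schema itself) is illustrative, not the paper's actual setup:

```python
import math

# Toy DFA for a reflection schema:
#   VERDICT: (correct | incorrect)  REASON: <free text> <eos>
# Each state maps grammar-legal next tokens to the successor state.
DFA = {
    "start":      {"VERDICT:": "verdict"},
    "verdict":    {"correct": "reason_hdr", "incorrect": "reason_hdr"},
    "reason_hdr": {"REASON:": "reason"},
    "reason":     {"<eos>": "done", "because": "reason", "the": "reason",
                   "premise": "reason", "failed": "reason"},
}

def toy_lm_scores(history):
    """Stand-in for an LLM's next-token scores. Unconstrained, this
    'model' would open with self-justifying filler ('because ...'),
    mimicking the snowballing failure mode; repeats are penalized so
    decoding eventually terminates."""
    base = {"because": 2.0, "the": 1.5, "premise": 1.0, "failed": 0.5,
            "correct": 1.8, "incorrect": 0.2,
            "VERDICT:": 0.1, "REASON:": 0.1, "<eos>": 0.05}
    return {t: s - 1.5 * history.count(t) for t, s in base.items()}

def constrained_decode(max_steps=10):
    """Greedy decoding with a structural mask: at every step, only
    grammar-legal tokens survive, then the model's favorite is taken."""
    state, out = "start", []
    for _ in range(max_steps):
        if state == "done":
            break
        allowed = DFA[state]  # the grammar mask
        scores = toy_lm_scores(out)
        tok = max(allowed, key=lambda t: scores.get(t, -math.inf))
        out.append(tok)
        state = allowed[tok]
    return out
```

Note how the mask, not the model, decides that `VERDICT:` comes first: the "LM" scores `because` far higher, but it is not legal in the start state. In the real Outlines setting the same idea is applied at the logits level, zeroing out tokens that would take the generation outside the compiled grammar before sampling.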


Tags: