Rating: 7 · Source: cs.LG updates on arXiv.org · Published: 2026-04-20
Rating rationale: mechanistic analysis of SFT-induced hallucinations and a self-distillation-based fix
Highlights
arXiv:2604.15574v1 Announce Type: cross Abstract: Large language models are prone to hallucinating factually incorrect statements. A key source of these errors is exposure to new factual information through supervised fine-tuning (SFT), which can increase hallucinations w.r.t. knowledge acquired during pre-training. In this work, we explore whether SFT-induced hallucinations can be mitigated using established tools from the continual learning literature, since they arise as a by-product of knowledge degradation during training. We propose a self-distillation-based SFT method that facilitates e…
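The abstract is truncated before the method details, but a common way to realize a self-distillation regularizer during SFT is to add a KL penalty that keeps the fine-tuned student close to the frozen pre-trained model's own output distribution, limiting degradation of pre-trained knowledge. The sketch below is a hypothetical illustration of that general idea (the function name, the forward-KL direction, and the weighting `lam` are assumptions, not the paper's actual formulation):

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_distill_sft_loss(student_logits, teacher_logits, labels, lam=0.5):
    """Hypothetical combined loss: cross-entropy on the SFT labels plus a
    forward-KL penalty tying the student to the frozen pre-trained teacher.

    student_logits, teacher_logits: (batch, vocab) arrays
    labels: (batch,) integer targets from the SFT data
    lam: weight on the distillation term (assumed hyperparameter)
    """
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    idx = np.arange(len(labels))
    ce = -np.log(p_s[idx, labels]).mean()                      # fit new SFT data
    kl = (p_t * (np.log(p_t) - np.log(p_s))).sum(-1).mean()    # stay near teacher
    return ce + lam * kl
```

When the student's distribution matches the frozen teacher's, the KL term vanishes and only the SFT cross-entropy remains; as fine-tuning pulls the student away from pre-trained behavior, the penalty grows, which is the continual-learning intuition the abstract appeals to.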
🤖 AI Commentary
This work explains how SFT on new factual information degrades knowledge acquired during pre-training, and frames the resulting hallucinations as a continual-learning problem with a self-distillation-based mitigation, making it directly relevant to practitioners who fine-tune LLMs.