Rating: 7 · Source: cs.LG updates on arXiv.org · Published: 2026-04-20
Rating rationale: mechanistic analysis of SFT-induced hallucinations and a self-distillation-based fix
Highlights
arXiv:2604.15574v1 Announce Type: cross Abstract: Large language models are prone to hallucinating factually incorrect statements. A key source of these errors is exposure to new factual information through supervised fine-tuning (SFT), which can increase hallucinations w.r.t. knowledge acquired during pre-training. In this work, we explore whether SFT-induced hallucinations can be mitigated using established tools from the continual learning literature, since they arise as a by-product of knowledge degradation during training. We propose a self-distillation-based SFT method that facilitates e…
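The abstract is truncated before the method details, but a common way to realize a self-distillation regularizer during SFT is to add a KL penalty that keeps the fine-tuned student close to the frozen pre-trained model's own output distribution, limiting degradation of pre-trained knowledge. The sketch below is a hypothetical illustration of that general idea (the function name, the forward-KL direction, and the weighting `lam` are assumptions, not the paper's actual formulation):

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_distill_sft_loss(student_logits, teacher_logits, labels, lam=0.5):
    """Hypothetical combined loss: cross-entropy on the SFT labels plus a
    forward-KL penalty tying the student to the frozen pre-trained teacher.

    student_logits, teacher_logits: (batch, vocab) arrays
    labels: (batch,) integer targets from the SFT data
    lam: weight on the distillation term (assumed hyperparameter)
    """
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    idx = np.arange(len(labels))
    ce = -np.log(p_s[idx, labels]).mean()                      # fit new SFT data
    kl = (p_t * (np.log(p_t) - np.log(p_s))).sum(-1).mean()    # stay near teacher
    return ce + lam * kl
```

When the student's distribution matches the frozen teacher's, the KL term vanishes and only the SFT cross-entropy remains; as fine-tuning pulls the student away from pre-trained behavior, the penalty grows, which is the continual-learning intuition the abstract appeals to.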
🤖 AI Commentary
This work explains how SFT on new factual information degrades knowledge acquired during pre-training, and frames the resulting hallucinations as a continual-learning problem with a self-distillation-based mitigation, making it directly relevant to practitioners who fine-tune LLMs.