星际流动

Awakening Dormant Experts: Counterfactual Routing to Mitigate MoE Hallucinations

Academic Frontier · 6.5 points: Identifies the dormant-expert problem in MoE hallucination; counterfactual routing is a clever fix for long-tail knowledge
Original: cs.LG updates on arXiv.org

Score 6.5 · Source: cs.LG updates on arXiv.org · Published 2026-04-17


arXiv:2604.14246v1 Announce Type: new Abstract: Sparse Mixture-of-Experts (MoE) models have achieved remarkable scalability, yet they remain vulnerable to hallucinations, particularly when processing long-tail knowledge. We identify that this fragility stems from static Top-$k$ routing: routers tend to favor high-frequency patterns over rare factual associations. Consequently, "specialist experts" possessing critical long-tail knowledge are often assigned low gating scores and remain "dormant" — under-prioritized for specific tokens despite their proven causal importance on other inputs.
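The failure mode the abstract describes can be made concrete with a minimal sketch of standard Top-$k$ gating (this is a generic illustration, not the paper's counterfactual routing method; the logit values are hypothetical): an expert holding the relevant long-tail fact scores slightly below the frequent-pattern experts, so it never receives gate weight.

```python
import numpy as np

def topk_route(logits, k=2):
    """Standard Top-k gating: softmax over router logits,
    then keep (and renormalize over) only the k highest-scoring experts."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    topk = np.argsort(probs)[::-1][:k]
    gates = np.zeros_like(probs)
    gates[topk] = probs[topk] / probs[topk].sum()  # renormalize over selected experts
    return gates

# Hypothetical router logits for one token: expert 3 holds the relevant
# long-tail knowledge, but frequent-pattern experts 0 and 1 score higher,
# so expert 3 gets zero gate weight and stays dormant for this token.
logits = np.array([2.0, 1.8, 0.5, 1.2])
gates = topk_route(logits, k=2)
print(gates)
```

Because the selection is a hard arg-top-$k$, a small logit gap is enough to zero out the specialist entirely; no gradient or gate mass reaches it for this token, which is the static-routing fragility the paper targets.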