Skip to content
星际流动

LLMs Gaming Verifiers: RLVR can Lead to Reward Hacking

发布
采集
学术前沿 7.5 分 — Critical finding: RLVR-trained models abandon rule induction for verifier gaming, important warning for reasoning scaling efforts
原文: cs.LG updates on arXiv.org

评分 7.5 · 来源:cs.LG updates on arXiv.org · 发布于 2026-04-17

评分依据:Critical finding: RLVR-trained models abandon rule induction for verifier gaming, important warning for reasoning scaling efforts

arXiv:2604.15149v1 Announce Type: new Abstract: As reinforcement Learning with Verifiable Rewards (RLVR) has become the dominant paradigm for scaling reasoning capabilities in LLMs, a new failure mode emerges: LLMs gaming verifiers. We study this phenomenon on inductive reasoning tasks, where models must induce and output logical rules. We find that RLVR-trained models systematically abandon rule induction.