
Pruning Long Chain-of-Thought of Large Reasoning Models via Small-Scale Preference Optimization

Academic Frontier · Score 6.0 — Prunes redundant CoT reasoning via small-scale preference optimization; directly addresses reasoning efficiency

Source: cs.AI updates on arXiv.org

Score 6 · Source: cs.AI updates on arXiv.org · Published 2026-04-17


arXiv:2508.10164v2 Announce Type: replace

Abstract: Recent advances in Large Reasoning Models (LRMs) have demonstrated strong performance on complex tasks through long Chain-of-Thought (CoT) reasoning. However, their lengthy outputs increase computational costs and can lead to overthinking, making it challenging to balance reasoning effectiveness and efficiency. Current solutions often compromise reasoning quality or require extensive resources. In this paper, we investigate how to reduce the generation length of LRMs with limited tuning.
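The abstract only names "small-scale preference optimization" without detailing the objective. A common instantiation of preference optimization is the DPO loss, which here could pair a short correct chain of thought ("chosen") against a longer trace with the same answer ("rejected"). The pairing scheme, the `beta` value, and the function name below are illustrative assumptions, not the paper's confirmed method; this is a minimal sketch of how such a length-pruning preference signal might be scored.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair (hypothetical setup: the
    "chosen" response is a short correct CoT, the "rejected" one a
    longer trace reaching the same answer)."""
    # Log-ratio of the tuned policy vs. the frozen reference model.
    chosen_ratio = logp_chosen - ref_logp_chosen
    rejected_ratio = logp_rejected - ref_logp_rejected
    # Bradley-Terry logistic loss on the scaled margin: -log sigmoid(margin).
    margin = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When the policy matches the reference, the margin is 0 and the loss
# starts at log(2); preferring the short CoT more than the reference
# does pushes the loss below that baseline.
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)
improved = dpo_loss(-10.0, -30.0, -12.0, -25.0)
```

Minimizing this loss over such pairs nudges the policy toward shorter reasoning traces without retraining from scratch, which matches the abstract's goal of reducing generation length "with limited tuning".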