Score 7.5 · Source: cs.LG updates on arXiv.org · Published 2026-04-17
Scoring rationale: Addresses a key question for agent capability. Introduces the PASS@(k,T) metric, which separates capability expansion from efficiency improvement, and extends findings from static reasoning to agentic tool use.
arXiv:2604.14877v1 Announce Type: new Abstract: Does reinforcement learning genuinely expand what LLM agents can do, or merely make them more reliable? For static reasoning, recent work answers the second: base and RL pass@k curves converge at large k. We ask whether this holds for agentic tool use, where T rounds of interaction enable compositional strategies that re-sampling cannot recover. We introduce PASS@(k,T), a two-dimensional metric that jointly varies sampling budget k and interaction depth T, separating capability expansion from efficiency improvement.
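The abstract does not say how PASS@(k,T) is computed, so the following is only a rough sketch of one plausible reading: the standard unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), evaluated separately for each interaction cap T. The function names, the `solve_rounds` data layout, and the rule that a rollout solved at round r counts as solved under any cap T >= r are all assumptions made for illustration, not the paper's definitions.

```python
from math import comb
from typing import Optional, Sequence


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples, c of which are correct."""
    if not 0 < k <= n:
        raise ValueError("k must satisfy 0 < k <= n")
    if n - c < k:
        # Fewer than k failures: every size-k subset contains a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_at_k_T(
    solve_rounds: Sequence[Sequence[Optional[int]]], k: int, T: int
) -> float:
    """Mean pass@k over tasks, counting only rollouts solved within T rounds.

    solve_rounds[i][j] is the (hypothetical) interaction round at which
    rollout j of task i first succeeded, or None if it never succeeded.
    Assumption: success at round r implies success under any cap T >= r.
    """
    per_task = []
    for rounds in solve_rounds:
        n = len(rounds)
        c = sum(1 for r in rounds if r is not None and r <= T)
        per_task.append(pass_at_k(n, c, k))
    return sum(per_task) / len(per_task)


# Toy data: 2 tasks, 4 rollouts each (made-up values for illustration).
toy = [[1, None, 3, None], [None, None, None, 5]]
grid = {(k, T): pass_at_k_T(toy, k, T) for k in (1, 4) for T in (1, 5)}
```

In this toy grid, raising k at fixed T=1 never helps the second task, because none of its rollouts succeed within one round, while raising T to 5 does. That is the kind of gap a two-dimensional sweep over k and T is meant to expose, though the paper's actual protocol may differ.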