Does RL Expand the Capability Boundary of LLM Agents? A PASS@(k,T) Analysis

Published: 2026-04-17

Collected: 2026-04-17 04:31

Academic Frontier · score 7.5: A key question for agent capability. Introduces the PASS@(k,T) metric, which separates capability expansion from efficiency, and extends findings from static reasoning to agentic tool use.

Score 7.5 · Source: cs.LG updates on arXiv.org · Published 2026-04-17


arXiv:2604.14877v1 · Announce type: new

Abstract: Does reinforcement learning genuinely expand what LLM agents can do, or merely make them more reliable? For static reasoning, recent work answers the second: base and RL pass@k curves converge at large k. We ask whether this holds for agentic tool use, where T rounds of interaction enable compositional strategies that re-sampling cannot recover. We introduce PASS@(k,T), a two-dimensional metric that jointly varies sampling budget k and interaction depth T, separating capability expansion from efficiency improvement.
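The abstract does not spell out how PASS@(k,T) is computed, so the following is only a minimal sketch of one plausible reading. It assumes that, for each interaction cap T, every task is attempted n times and the number of successful rollouts is recorded. It then applies the standard unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), at each T and averages over tasks. The function names `pass_at_k` and `pass_at_k_t_grid` and the input layout are hypothetical, not taken from the paper.

```python
from math import comb
from typing import Dict, List, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Standard unbiased pass@k estimate for one task.

    n: rollouts sampled, c: rollouts that succeeded, k: sampling budget.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k rollouts failed.
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_at_k_t_grid(
    successes_by_depth: Dict[int, List[int]],
    n: int,
    ks: List[int],
) -> Dict[Tuple[int, int], float]:
    """Assumed layout: successes_by_depth[T][i] is the number of successful
    rollouts (out of n) on task i when interaction is capped at T rounds.

    Returns a mapping (k, T) -> mean pass@k over tasks at depth cap T.
    """
    grid: Dict[Tuple[int, int], float] = {}
    for depth, counts in successes_by_depth.items():
        for k in ks:
            per_task = [pass_at_k(n, c, k) for c in counts]
            grid[(k, depth)] = sum(per_task) / len(per_task)
    return grid


# Toy data (made up for illustration): 2 tasks, n = 4 rollouts per depth cap.
toy = {1: [0, 2], 4: [1, 4]}
grid = pass_at_k_t_grid(toy, n=4, ks=[1, 4])
```

Under this reading, computing the grid for a base model and for its RL-tuned counterpart allows a comparison along both axes. If the two models' values converge as k grows at a fixed T, the difference looks like an efficiency gain. A gap that persists at large k would point toward capability expansion.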