Score 6.7 · Source: cs.AI updates on arXiv.org · Published 2026-04-08
Scoring rationale: tool-output pruning optimization for coding agents
arXiv:2604.04979v1 Announce Type: cross Abstract: Coding agents repeatedly consume long tool observations even though only a small fraction of each observation matters for the next step. We study task-conditioned tool-output pruning: given a focused query and one tool output, return the smallest verbatim evidence block the agent should inspect next. We introduce a benchmark of 11,477 examples built from SWE-bench repository interactions and synthetic multi-ecosystem tool outputs, with a manually curated 618-example test set. We fine-tune Qwen 3.5 2B with LoRA and compare it against larger zero
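
To make the task definition concrete (a focused query plus one tool output goes in, a small verbatim evidence block comes out), here is a minimal sketch of a keyword-overlap baseline. It is not the paper's fine-tuned model or any of its compared systems; the function names `tokenize` and `prune_tool_output`, the `max_lines` window cap, and the scoring rule are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase identifier-like tokens; a deliberately crude assumption."""
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text.lower()))


def prune_tool_output(query, output, max_lines=3):
    """Return a contiguous, verbatim block of `output` lines for `query`.

    Hypothetical heuristic: score each line by token overlap with the
    query, then pick the window of at most `max_lines` lines with the
    highest total overlap, preferring the shorter window on ties.
    Returns "" when no line shares a token with the query.
    """
    lines = output.splitlines()
    query_tokens = tokenize(query)
    if not lines or not query_tokens:
        return ""

    scores = [len(query_tokens & tokenize(line)) for line in lines]

    best_key, best_span = None, None
    for start in range(len(lines)):
        total = 0
        for end in range(start, min(start + max_lines, len(lines))):
            total += scores[end]
            if total == 0:
                continue
            # Higher overlap wins; on equal overlap, the shorter span wins.
            key = (total, -(end - start + 1))
            if best_key is None or key > best_key:
                best_key, best_span = key, (start, end)

    if best_span is None:
        return ""
    start, end = best_span
    # Join the original lines unchanged so the block stays verbatim.
    return "\n".join(lines[start:end + 1])


if __name__ == "__main__":
    tool_output = (
        "collected 3 items\n"
        "tests/test_math.py::test_add PASSED\n"
        "tests/test_math.py::test_div FAILED\n"
        "ZeroDivisionError: division by zero\n"
        "1 failed, 2 passed"
    )
    print(prune_tool_output("why did test_div fail ZeroDivisionError", tool_output))
```

A lexical baseline like this only illustrates the input/output contract; it has no notion of which evidence the agent actually needs next, which is the gap a learned pruner is meant to close.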