AgentHER: Hindsight Experience Replay for LLM Agent Trajectory Relabeling

Published

April 8, 2026

Collected April 8, 2026, 04:31

Academic Frontier, 6.4 points: an AI research paper of some reference value

Score 6.4 · Source: cs.CL updates on arXiv.org · Published 2026-04-08

Scoring rationale: an AI research paper of some reference value

arXiv:2603.21357v2 · Announce Type: replace-cross

Abstract: LLM agents fail on the majority of real-world tasks: GPT-4o succeeds on fewer than 15% of WebArena navigation tasks and achieves below 55% pass@1 on ToolBench (Zhou et al., 2024; Qin et al., 2024). Yet every failed trajectory is routinely discarded, wasting the dominant source of collected experience. We introduce AgentHER, a framework that recovers this lost training signal by adapting the Hindsight Experience Replay (HER; Andrychowicz et al., 2017) principle to natural-language agent trajectories for offline data augmentation. The key insi