Tag: Deep Research

All the articles with the tag "Deep Research".

5.5
DRBENCHER: Can Your Agent Identify the Entity, Retrieve Its Properties and Do the Math?
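April 13, 2026
· arXiv cs.AI · collected 04/13 12:31
DRBENCHER is a synthetic benchmark generator that produces deep research questions requiring both web browsing and multi-step computation. It is used to evaluate how deep research agents perform in realistic research scenarios.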
8.0
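Detecting and Correcting Citation Hallucinations in LLMs and Deep Research Agents: 3-13% of Citation URLs Are Fabricated
April 6, 2026
· arXiv cs.CL · collected 04/06 12:33
A systematic evaluation of citation reliability across 10 models and agents finds that 3-13% of cited URLs are hallucinated (they never existed). Deep Research Agents generate more hallucinations but also have a higher correction rate.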
7.0
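Self-Optimizing Multi-Agent Systems: Self-Optimization of Multi-Agent Deep Research Systems
April 6, 2026
· arXiv cs.AI · collected 04/06 12:33
Explores several multi-agent optimization methods for Deep Research systems, using automated prompt optimization and architecture search to improve research quality and reduce reliance on manual tuning.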
