MM-tau-p$^2$: Persona-Adaptive Prompting for Robust Multi-Modal Agent Evaluation in Dual-Control Settings

发布

2026年04月08日

采集 2026年04月08日 04:31

学术前沿 6.4 分 — 有一定参考价值的AI研究论文

评分 6.4 · 来源：cs.AI updates on arXiv.org · 发布于 2026-04-08

评分依据：有一定参考价值的AI研究论文

arXiv:2603.09643v4 Announce Type: replace-cross Abstract: Current evaluation frameworks and benchmarks for LLM powered agents focus on text chat driven agents, these frameworks do not expose the persona of user to the agent, thus operating in a user agnostic environment. Importantly, in customer experience management domain, the agent’s behaviour evolves as the agent learns about user personality. With proliferation of real time TTS and multi-modal language models, LLM based agents are gradually going to become multi-modal. Towards this, we propose the MM-tau-p$^2$ benchmark with metrics for e