GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents

发布

2026年04月14日

采集 2026年04月14日 04:31

学术前沿 5.3 分 — 中等质量：常规学术论文，有适度参考价值

评分 5.3 · 来源：cs.AI updates on arXiv.org · 发布于 2026-04-14

评分依据：中等质量：常规学术论文，有适度参考价值

GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents

arXiv:2603.24329v2 Announce Type: replace-cross Abstract: Multimodal LLMs are increasingly deployed as perceptual backbones for autonomous agents in 3D environments, from robotics to virtual worlds. These applications require agents to perceive rapid state changes, attribute actions to the correct entities, and reason about concurrent multi-agent behaviors from a first-person perspective, capabilities that existing benchmarks do not adequately evaluate. We introduce GameplayQA, a framework for…