World-Value-Action Model: Implicit Planning for Vision-Language-Action Systems

Academic Frontier · Score 6.5: Implicit planning for VLA systems; addresses a key limitation of direct action prediction in embodied AI
Original: cs.LG updates on arXiv.org

Score 6.5 · Source: cs.LG updates on arXiv.org · Published 2026-04-17

arXiv:2604.14732v1 · Announce Type: cross

Abstract: Vision-Language-Action (VLA) models have emerged as a promising paradigm for building embodied agents that ground perception and language in action. However, most existing approaches rely on direct action prediction and cannot reason over long-horizon trajectories or evaluate their consequences, which limits performance on complex decision-making tasks. In this work, we introduce the World-Value-Action (WAV) model, a unified framework that enables implicit planning in VLA systems.