
Multi-ORFT: Stable Online Reinforcement Fine-Tuning for Multi-Agent Diffusion Planning in Cooperative Driving

Academic Frontier · score 5.7 · Source: cs.AI updates on arXiv.org · Published 2026-04-14

Rating basis: upper-middle — some information gain and reference value


arXiv:2604.11734v1 (announce type: cross)

Abstract: Closed-loop cooperative driving requires planners that generate realistic, multimodal, multi-agent trajectories while improving safety and traffic efficiency. Existing diffusion planners can model multimodal behaviors from demonstrations, but they often exhibit weak scene consistency and remain poorly aligned with closed-loop objectives; meanwhile, stable online post-training in reactive multi-agent environments remains difficult. We present…
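To make the idea of online reinforcement fine-tuning of a diffusion-style planner concrete, here is a minimal toy sketch. It is not the paper's Multi-ORFT method: the linear "denoiser", the single-agent 1-D trajectory, the reward function, and the reward-weighted soft update are all illustrative assumptions, standing in for one common family of approaches (reward-weighted regression on sampled rollouts).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "diffusion planner": the learnable parameter is a mean trajectory
# of 8 waypoints; sampling iteratively denoises Gaussian noise toward it.
theta = np.zeros(8)

def sample_trajectory(theta, steps=10, noise=0.1):
    """Draw one trajectory by iterative denoising toward theta."""
    x = rng.normal(size=theta.shape)
    for t in range(steps):
        alpha = (t + 1) / steps
        x = (1 - alpha) * x + alpha * theta + noise * rng.normal(size=x.shape)
    return x

def reward(traj, target=1.0):
    """Illustrative closed-loop objective: track a target profile."""
    return -np.mean((traj - target) ** 2)

# Online fine-tuning loop: sample a batch of rollouts, weight them by
# exponentiated reward, and softly pull theta toward the weighted mean.
for _ in range(300):
    trajs = [sample_trajectory(theta) for _ in range(16)]
    rs = np.array([reward(t) for t in trajs])
    w = np.exp((rs - rs.max()) / 0.1)   # temperature-scaled weights
    w /= w.sum()
    weighted_mean = sum(wi * ti for wi, ti in zip(w, trajs))
    theta += 0.5 * (weighted_mean - theta)  # soft update for stability

print("final reward:", float(reward(theta)))
```

The soft update (interpolating toward the reward-weighted batch mean rather than replacing the parameters outright) is one simple way to keep such online post-training from collapsing, which is the kind of stability concern the abstract raises, though the paper's actual mechanism may differ.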