星际流动

Chain-of-Thought Degrades Visual Spatial Reasoning Capabilities of Multimodal LLMs

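Academic Frontier, score 6.0. CoT degrades the visual spatial reasoning capabilities of multimodal LLMs: an important negative finding that affects the choice of multimodal reasoning paradigm.
Original: cs.AI updates on arXiv.org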

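Score 6 · Source: cs.AI updates on arXiv.org · Published 2026-04-20

Scoring rationale: CoT degrades the visual spatial reasoning capabilities of multimodal LLMs: an important negative finding that affects the choice of multimodal reasoning paradigm.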

Key points

arXiv:2604.16060v1 · Announce Type: cross

Abstract: Multimodal Reasoning Models (MRMs) leveraging Chain-of-Thought (CoT) based thinking have revolutionized mathematical and logical problem-solving. However, we show that this paradigm struggles with generalized spatial intelligence. We perform a comprehensive evaluation of seventeen models across thirteen spatial benchmarks and identify a critical gap: CoT prompting consistently degrades performance in visual spatial reasoning. Furthermore, through a novel No-Image++ ablation, we demonstrate that MRMs and CoT prompted MLMs suffer from severe shor…
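
The comparison the abstract describes (direct prompting versus CoT prompting on the same benchmark) can be illustrated with a minimal sketch. This is not the paper's evaluation code: the function names `accuracy` and `cot_delta`, the exact-match scoring rule, and the toy data are all assumptions made for illustration.

```python
from statistics import mean


def accuracy(predictions, answers):
    """Fraction of predictions that exactly match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    return mean(1.0 if p == a else 0.0 for p, a in zip(predictions, answers))


def cot_delta(direct_preds, cot_preds, answers):
    """Accuracy under CoT prompting minus accuracy under direct prompting.

    A negative value means CoT prompting hurt performance on this item set.
    """
    return accuracy(cot_preds, answers) - accuracy(direct_preds, answers)


# Toy, made-up data for illustration only (not results from the paper).
answers = ["left", "right", "behind", "left"]
direct_preds = ["left", "right", "behind", "right"]  # 3 of 4 correct
cot_preds = ["left", "front", "behind", "right"]     # 2 of 4 correct

delta = cot_delta(direct_preds, cot_preds, answers)
print(f"CoT accuracy delta: {delta:+.2f}")  # prints "CoT accuracy delta: -0.25"
```

Reporting the signed difference per benchmark, rather than two separate accuracies, makes a consistent degradation easy to spot: under this convention it would show up as a negative delta on most benchmarks.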

🤖 AI commentary

This paper provides important information for the AI field and merits the attention of industry practitioners.

