星际流动

The Cost of Language: Centroid Erasure Exposes and Exploits Modal Competition in Multimodal LLMs

Academic Frontier · 6.0 points: Reveals modal competition in MLLMs via centroid erasure, important for understanding multimodal model behavior
Original: cs.CL updates on arXiv.org

Score: 6 · Source: cs.CL updates on arXiv.org · Published: 2026-04-17

Scoring rationale: Reveals modal competition in MLLMs via centroid erasure, important for understanding multimodal model behavior

arXiv:2604.14363v1 · Announce Type: new

Abstract: Multimodal language models systematically underperform on visual perception tasks, yet the structure underlying this failure remains poorly understood. We propose centroid replacement, collapsing each token to its nearest K-means centroid, as a controlled probe for modal dependence.
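
The centroid replacement described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say which representations are clustered, how many centroids are used, or how K-means is fit, so the random `tokens` array, `k=8`, and the helper names `kmeans` and `centroid_replace` are all assumptions made for illustration only.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's K-means on the rows of x; returns a (k, d) centroid array."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct rows of x.
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every row to every centroid: (n, k).
        d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        assign = d2.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    return centroids


def centroid_replace(tokens, centroids):
    """Collapse each token vector to its nearest centroid."""
    d2 = ((tokens[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    return centroids[idx], idx


# Hypothetical stand-in for token embeddings: 200 tokens of dimension 16.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(200, 16))

centroids = kmeans(tokens, k=8)          # k=8 is an arbitrary choice
erased, idx = centroid_replace(tokens, centroids)
```

After replacement, `erased` has the same shape as `tokens` but contains at most `k` distinct vectors, so any information that distinguished tokens within a cluster is gone. In a probing setup of this kind, one would feed `erased` to the model in place of `tokens` and compare task performance.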