Skip to content
星际流动

IBISAgent: Reinforcing Pixel-Level Visual Reasoning in MLLMs for Universal Biomedical Object Referring and Segmentation

发布
采集
学术前沿 6.4 分 — 有一定参考价值的AI研究论文
原文: cs.AI updates on arXiv.org

评分 6.4 · 来源:cs.AI updates on arXiv.org · 发布于 2026-04-08

评分依据:有一定参考价值的AI研究论文

arXiv:2601.03054v4 Announce Type: replace-cross Abstract: Recent research on medical MLLMs has gradually shifted its focus from image-level understanding to fine-grained, pixel-level comprehension. Although segmentation serves as the foundation for pixel-level understanding, existing approaches face two major challenges. First, they introduce implicit segmentation tokens and require simultaneous fine-tuning of both the MLLM and external pixel decoders, which increases the risk of catastrophic forgetting and limits generalization to out-of-domain scenarios. Second, most methods rely on single-p


标签: