
OutSafe-Bench: A Benchmark for Multimodal Offensive Content Detection in Large Language Models

Academic frontier · 6.4 points — an AI research paper of some reference value
Source: cs.CL updates on arXiv.org

Score 6.4 · Source: cs.CL updates on arXiv.org · Published 2026-04-08


arXiv:2511.10287v4 Announce Type: replace-cross Abstract: As Multimodal Large Language Models (MLLMs) are increasingly integrated into everyday tools and intelligent agents, concerns have grown about their potential to output unsafe content, ranging from toxic language and biased imagery to privacy violations and harmful misinformation. Current safety benchmarks remain limited in both modality coverage and performance evaluation, often neglecting the broader landscape of content safety. In this work, we introduce OutSafe-Bench, the first most comprehensive conte
