Weight space Detection of Backdoors in LoRA Adapters

Published

2026-04-08

Collected 2026-04-08 04:31

Academic Frontier · 6.0 points: an AI research paper of some reference value

Score 6.0 · Source: cs.CL updates on arXiv.org · Published 2026-04-08

Scoring rationale: an AI research paper of some reference value

arXiv:2602.15195v3 Announce Type: replace-cross Abstract: LoRA adapters let users fine-tune large language models (LLMs) efficiently. However, LoRA adapters are shared through open repositories like Hugging Face Hub \citep{huggingface_hub_docs}, making them vulnerable to backdoor attacks. Current detection methods require running the model with test input data — making them impractical for screening thousands of adapters where the trigger for backdoor behavior is unknown. We detect poisoned adapters by analyzing their weight matrices directly, without running the model — making our method tr
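To make the idea of screening an adapter from its weights alone more concrete, here is a minimal sketch in Python with NumPy. It is not the paper's method, since the abstract above is truncated before the method is described. The function name `lora_spectral_stats`, the choice of statistics (top singular-value energy share, spectral entropy, Frobenius norm), and the matrix shapes are all assumptions made for illustration. It only shows that summary statistics of the low-rank update B·A can be computed without ever running the model.

```python
import numpy as np


def lora_spectral_stats(A, B):
    """Summarize the spectrum of a LoRA update dW = B @ A (hypothetical helper).

    A has shape (r, d_in) and B has shape (d_out, r), with r <= d_in, d_out.
    The statistics returned here are illustrative assumptions, not the
    features used in the paper.
    """
    # With B = Q_b R_b and A.T = Q_a R_a, we get B @ A = Q_b (R_b R_a.T) Q_a.T.
    # Q_b and Q_a have orthonormal columns, so the nonzero singular values of
    # B @ A equal those of the small r x r matrix R_b @ R_a.T.
    _, r_b = np.linalg.qr(B)
    _, r_a = np.linalg.qr(A.T)
    s = np.linalg.svd(r_b @ r_a.T, compute_uv=False)

    energy = s ** 2
    total = energy.sum()
    if total == 0.0:
        # Degenerate all-zero adapter: no spectrum to describe.
        return {"singular_values": s, "top1_energy": 0.0,
                "spectral_entropy": 0.0, "frobenius_norm": 0.0}

    p = energy / total                    # normalized energy distribution
    nz = p[p > 0]                         # skip zeros to avoid log(0)
    return {
        "singular_values": s,
        "top1_energy": float(p[0]),       # share held by the largest singular value
        "spectral_entropy": float(-(nz * np.log(nz)).sum()),
        "frobenius_norm": float(np.sqrt(total)),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(4, 32))          # stand-in for a rank-4 LoRA "A" matrix
    B = rng.normal(size=(16, 4))          # stand-in for the matching "B" matrix
    print(lora_spectral_stats(A, B))
```

In a screening setting, statistics like these would be computed per adapted layer and passed to some downstream classifier or threshold. Which features and which classifier actually separate poisoned adapters from benign ones is an empirical question that this sketch does not answer.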