
FreakOut-LLM: The Effect of Emotional Stimuli on Safety Alignment

Academic Frontier, score 6.3: the effect of emotional stimuli on LLM safety alignment
Original source: cs.AI updates on arXiv.org

Score 6.3 · Source: cs.AI updates on arXiv.org · Published 2026-04-08

Scoring rationale: the effect of emotional stimuli on LLM safety alignment

arXiv:2604.04992v1 Announce Type: cross Abstract: Safety-aligned LLMs go through refusal training to reject harmful requests, but whether these mechanisms remain effective under emotionally charged stimuli is unexplored. We introduce FreakOut-LLM, a framework investigating whether emotional context compromises safety alignment in adversarial settings. Using validated psychological stimuli, we evaluate how emotional priming through system prompts affects jailbreak susceptibility across ten LLMs. We test three conditions (stress, relaxation, neutral) using scenarios from established psychologica

