星际流动

Mechanistic Decoding of Cognitive Constructs in LLMs

Academic Frontier · 5.5 points · Mechanistic interpretability approach to decoding cognitive constructs from LLMs; bridges cognitive science and MI
Original: cs.CL updates on arXiv.org

Score 5.5 · Source: cs.CL updates on arXiv.org · Published 2026-04-17

Scoring rationale: Mechanistic interpretability approach to decoding cognitive constructs from LLMs; bridges cognitive science and MI

arXiv:2604.14593v1 · Announce Type: new

Abstract: While Large Language Models (LLMs) demonstrate increasingly sophisticated affective capabilities, the internal mechanisms by which they process complex emotions remain unclear. Existing interpretability approaches often treat models as black boxes or focus on coarse-grained basic emotions, leaving the cognitive structure of more complex affective states underexplored. To bridge this gap, we propose a Cognitive Reverse-Engineering framework based on Representation Engineering (RepE) to analyze social-comparison jealousy.