Rating: 6 · Source: cs.LG updates on arXiv.org · Published: 2026-04-17
Rating rationale: Solid mechanistic interpretability work identifying the components of the sink circuit; causal interventions validate the findings.
arXiv:2604.14722v1 Announce Type: new Abstract: Transformers commonly exhibit an attention sink: disproportionately high attention to the first position. We study this behavior in GPT-2-style models with learned query biases and absolute positional embeddings. Combining structural analysis with causal interventions, validated across natural-language, mathematical, and code inputs, we find that the sink arises from the interaction among (i) a learned query bias, (ii) the first-layer MLP transformation of the positional encoding, and (iii) structure in the key projection.
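The mechanism the abstract describes can be illustrated with a toy sketch (not the paper's code): if a learned query bias aligns with a distinctive direction carried by the key of position 0 (a stand-in for the MLP-transformed positional encoding passed through the key projection), softmax attention from every query concentrates on the first position. All magnitudes and dimensions below are illustrative assumptions.

```python
# Toy illustration of an attention sink: a query bias aligned with the
# key of position 0 drives near-all attention mass to that position.
import numpy as np

rng = np.random.default_rng(0)
d, T = 16, 8  # head dimension and sequence length (illustrative)

# Keys: position 0 carries a distinctive direction, standing in for the
# first-layer-MLP-transformed positional encoding projected through W_K.
sink_dir = np.zeros(d)
sink_dir[0] = 1.0
keys = rng.normal(scale=0.1, size=(T, d))
keys[0] += 5.0 * sink_dir

# Queries: small content-dependent part plus a learned bias that points
# along the same sink direction.
query_bias = 5.0 * sink_dir
queries = rng.normal(scale=0.1, size=(T, d)) + query_bias

# Scaled dot-product attention with a numerically stable softmax.
scores = queries @ keys.T / np.sqrt(d)
attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
attn /= attn.sum(axis=-1, keepdims=True)

# Attention each query pays to position 0: close to 1.0 for all queries.
print(attn[:, 0])
```

Removing either ingredient (zeroing `query_bias`, or dropping the boost on `keys[0]`) flattens the attention distribution, which is the kind of causal intervention the abstract reports.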