星际流动

LLMs Know They're Wrong and Agree Anyway: The Shared Sycophancy-Lying Circuit

Academic Frontier, score 8.0: Landmark discovery identifying shared attention heads for error detection and sycophancy across 12 models from 5 labs; a major interpretability result.
Original source: cs.LG updates on arXiv.org

Score: 8 · Source: cs.LG updates on arXiv.org · Published: 2026-04-29

Scoring rationale: Landmark discovery identifying shared attention heads for error detection and sycophancy across 12 models from 5 labs; a major interpretability result.