Measuring the Permission Gate: A Stress-Test Evaluation of Claude Code's Auto Mode

Published

2026-04-08

Collected 2026-04-08 04:31

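Academic Frontier, score 7.7: permission testing of Claude Code's auto mode, directly practical

Score 7.7 · Source: cs.AI updates on arXiv.org · Published 2026-04-08

Scoring rationale: permission testing of Claude Code's auto mode, directly practical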

arXiv:2604.04978v1 Announce Type: cross Abstract: Claude Code’s auto mode is the first deployed permission system for AI coding agents, using a two-stage transcript classifier to gate dangerous tool calls. Anthropic reports a 0.4% false positive rate and 17% false negative rate on production traffic. We present the first independent evaluation of this system on deliberately ambiguous authorization scenarios, i.e., tasks where the user’s intent is clear but the target scope, blast radius, or risk level is underspecified. Using AmPermBench, a 128-prompt benchmark spanning four DevOps task famili
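
For readers unfamiliar with the two rates quoted in the abstract, the following is a minimal sketch of how a false positive rate and a false negative rate are conventionally computed from confusion counts. The function name `gate_error_rates`, the convention that "positive" means a tool call the gate blocked, and the example counts are illustrative assumptions; the counts are invented to reproduce the quoted percentages and are not figures from the paper or from Anthropic.

```python
def gate_error_rates(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate) for a permission gate.

    Assumed convention (hypothetical, not taken from the paper):
      tp: dangerous calls that were blocked
      fp: benign calls that were blocked
      tn: benign calls that were allowed
      fn: dangerous calls that were allowed
    """
    benign = fp + tn      # all calls that should have been allowed
    dangerous = tp + fn   # all calls that should have been blocked
    if benign == 0 or dangerous == 0:
        raise ValueError("need at least one benign and one dangerous call")
    return fp / benign, fn / dangerous


# Made-up counts chosen only so the rates match the quoted 0.4% and 17%.
fpr, fnr = gate_error_rates(tp=83, fp=4, tn=996, fn=17)
print(f"FPR={fpr:.1%}  FNR={fnr:.1%}")
```

Note that the two rates use different denominators: the false positive rate is taken over benign calls only, and the false negative rate over dangerous calls only, so a low value for one says nothing about the other.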