Tag: computer-use

All the articles with the tag "computer-use".

7.7
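AgentHazard: The First Benchmark for Evaluating Harmful Behavior in Computer-Use Agents
April 6, 2026
· arXiv cs.AI · collected 04/06 12:33
Proposes the first benchmark for systematically evaluating harmful behavior in computer-use agents, focusing on a new safety challenge: how locally reasonable steps chain together into globally harmful behavior.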
6.8
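WebTestBench: Evaluating Computer-Use Agents on End-to-End Automated Web Testing
March 27, 2026
· cs.CL updates on arXiv.org · collected 03/27 12:31
A benchmark that evaluates whether computer-use agents can reliably verify web feature implementations automatically, filling a testing gap in the vibe-coding era.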
7.4
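CUA-Suite: A Large-Scale Human-Annotated Video Demonstration Dataset for Advancing Computer-Use Agents
March 26, 2026
· cs.LG updates on arXiv.org · collected 03/26 14:33
A million-scale dataset of continuous video demonstrations that addresses the core bottleneck of scarce training data for computer-use agents (CUAs).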
5.0
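Anthropic Announces That Claude Can Now Control Users' Computers to Complete Tasks
March 25, 2026
· 36Kr · collected 03/25 08:31
Anthropic releases the official version of Claude Computer Use; users can send task instructions from their phone to have Claude operate their computer.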
5.0
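Christopher Mims's Sharp Take: Giving AI Full Control of Your Computer Will Look Foolish in Retrospect
March 25, 2026
· Simon Willison · collected 03/25 08:31
WSJ journalist Christopher Mims warns against the trend of giving AI full control of users' computers.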
8.0
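OpenAI Releases GPT-5.4: The First General-Purpose Model with Native Computer Use
March 22, 2026
· OpenAI · collected 03/22 14:45
GPT-5.4 surpasses its predecessor across coding, agent workflows, and general reasoning, scoring 75% on OSWorld (above the human baseline) and 83% on GDPval, which covers 44 occupations.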
7.7
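Holotron-12B: A High-Throughput Open-Source Computer Use Agent Model
March 17, 2026
· Hugging Face · collected 03/18 08:35
H Company and NVIDIA release a 12B-parameter computer-use agent model whose hybrid SSM architecture delivers a 2x throughput improvement.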
7.8
Holotron-12B: A High-Throughput Computer Use Agent
March 17, 2026
· Hugging Face Blog · collected 03/17 14:39
H Company releases Holotron-12B, an AI agent model focused on high-throughput computer operation.
7.8
Holotron-12B - High Throughput Computer Use Agent
March 17, 2026
H Company releases Holotron-12B, a high throughput computer use agent model available on Hugging Face.
8.5
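Elon Musk Announces Macrohard: Tesla and xAI Jointly Build "Digital Optimus" as a Replacement for Software Companies
March 12, 2026
· Reuters
Musk announces the Macrohard project: Grok serves as the "navigator" while Digital Optimus handles on-screen computer interaction, with the goal of "simulating the functions of an entire company."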
5.5
Anthropic Acquires Vercept: Doubling Down on Claude's Computer Use Capabilities
February 25, 2026
· collected 03/24 22:33
Anthropic acquires Vercept to strengthen Claude's computer-use capabilities.
