HorizonMath: Measuring AI Progress Toward Mathematical Discovery

Published

March 16, 2026

Score: 8.2. Major benchmark release; breakthrough progress by GPT 5.4 Pro on mathematical discovery; open source and reproducible.

HorizonMath introduces a benchmark of over 100 predominantly unsolved problems in computational mathematics with automated verification. GPT 5.4 Pro proposes solutions that improve on the best-known published results for two problems, representing potential novel contributions.