HorizonMath introduces a benchmark of over 100 predominantly unsolved problems in computational mathematics with automated verification. GPT 5.4 Pro proposes solutions that improve on the best-known published results for two problems, representing potential novel contributions.
HorizonMath: Measuring AI Progress Toward Mathematical Discovery
Release
Score: 8.2
Major benchmark release: breakthrough progress by GPT 5.4 Pro in mathematical discovery; open source and reproducible.