LemmaBench: A Live, Research-Level Benchmark to Evaluate LLM Capabilities in Mathematics
arXiv:2602.24173v1 Announce Type: new
Abstract: We present a new approach for benchmarking Large Language Model (LLM) capabilities on research-level mathematics. Existing benchmarks largely rely on …
Antoine Peyronnet, Fabian Gloeckle, Amaury Hayat