VeriSoftBench: Repository-Scale Formal Verification Benchmarks for Lean
arXiv:2602.18307v1. Abstract: Large language models have achieved striking results in interactive theorem proving, particularly in Lean. However, most benchmarks for LLM-based proof …
Yutong Xin, Qiaochu Chen, Greg Durrett, Işil Dillig