TML-Bench: Benchmark for Data Science Agents on Tabular ML Tasks
arXiv:2603.05764v1. Abstract: Autonomous coding agents can produce strong tabular baselines quickly on Kaggle-style tasks. Practical value depends on end-to-end correctness and reliability …
Mykola Pinchuk