Bench-MFG: A Benchmark Suite for Learning in Stationary Mean Field Games
arXiv:2602.12517v1
Abstract
The intersection of Mean Field Games (MFGs) and Reinforcement Learning (RL) has fostered a growing family of algorithms designed to solve large-scale multi-agent systems. However, the field currently lacks a standardized evaluation protocol, forcing researchers to rely on bespoke, isolated, and often simplistic environments. This fragmentation makes it difficult to assess the robustness, generalization, and failure modes of emerging methods. To address this gap, we propose a comprehensive benchmark suite for MFGs (Bench-MFG), focusing on the discrete-time, discrete-space, stationary setting for the sake of clarity. We introduce a taxonomy of problem classes, ranging from no-interaction and monotone games to potential and dynamics-coupled games, and provide prototypical environments for each. Furthermore, we propose MF-Garnets, a method for generating random MFG instances to facilitate rigorous statistical testing. We benchmark a variety of learning algorithms across these environments, including a novel black-box approach (MF-PSO) for exploitability minimization. Based on our extensive empirical results, we propose guidelines to standardize future experimental comparisons. Code available at https://github.com/lorenzomagnino/Bench-MFG.
Executive Summary
The paper introduces Bench-MFG, a comprehensive benchmark suite for evaluating learning algorithms in stationary Mean Field Games (MFGs). It addresses the absence of a standardized evaluation protocol in the field by proposing a taxonomy of problem classes with prototypical environments for each, restricted for clarity to the discrete-time, discrete-space, stationary setting. The study also introduces MF-Garnets for generating random MFG instances and benchmarks a range of algorithms, including a novel black-box approach to exploitability minimization (MF-PSO). The authors propose guidelines for future experimental comparisons and release their code for further research.
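Exploitability is the yardstick used throughout this line of work: it measures how much a single representative agent can gain by deviating from its policy while the population's mean field is held fixed, and it vanishes exactly at a Nash equilibrium. Once the mean field has been frozen into the transition kernel and reward, computing it reduces to standard MDP calculations, as in the following minimal sketch (array shapes, function names, and the discounted criterion are illustrative assumptions, not the Bench-MFG API):

```python
# Minimal sketch of the exploitability metric for a finite, stationary MFG.
# Assumptions (not the Bench-MFG API): the mean field mu is already frozen
# into P[a, s, s'] and r[s, a]; pi[s, a] is a stochastic policy; mu0 is the
# initial state distribution; gamma is a discount factor.
import numpy as np

def frozen_mdp_value(P, r, pi, gamma):
    """Value of policy pi in the MDP obtained by freezing the mean field."""
    r_pi = (pi * r).sum(axis=1)              # expected one-step reward per state
    P_pi = np.einsum("sa,asz->sz", pi, P)    # induced state-to-state kernel
    S = r_pi.shape[0]
    return np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)

def best_response_value(P, r, gamma, iters=2000):
    """Optimal value in the frozen MDP, via value iteration."""
    V = np.zeros(r.shape[0])
    for _ in range(iters):
        V = (r + gamma * np.einsum("asz,z->sa", P, V)).max(axis=1)
    return V

def exploitability(P, r, pi, mu0, gamma=0.99):
    """phi = <mu0, V* - V^pi> against the frozen mean field; when that mean
    field is the one pi itself induces, phi = 0 characterizes a Nash
    equilibrium."""
    return mu0 @ (best_response_value(P, r, gamma) - frozen_mdp_value(P, r, pi, gamma))
```

Benchmark curves in this setting typically report this quantity over learning iterations, so that different algorithms can be compared on a common scale.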
Key Points
- Bench-MFG, a benchmark suite to standardize the evaluation of MFG algorithms.
- A taxonomy of problem classes, with a prototypical environment for each.
- MF-Garnets, a generator of random MFG instances for statistical testing (see the sketch after this list).
- Benchmarks of a range of learning algorithms, including the new black-box MF-PSO.
- Guidelines to standardize future experimental comparisons.
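The exact MF-Garnet construction is specified in the paper; the sketch below is only a plausible reading, following the classic Garnet recipe for random MDPs (sparse random transitions with branching factor b) and adding a simple crowd-averse mean-field coupling through the reward. The function name, signature, and coupling form are assumptions for illustration:

```python
# Hedged sketch of a Garnet-style random MFG generator (illustrative; the
# paper's MF-Garnet construction may differ). Each (action, state) pair gets
# b random reachable successors with Dirichlet-random probabilities, and the
# reward is coupled to the mean field mu through a crowd-aversion term.
import numpy as np

def random_garnet_mfg(S, A, b, coupling=1.0, seed=None):
    rng = np.random.default_rng(seed)
    P = np.zeros((A, S, S))
    for a in range(A):
        for s in range(S):
            succ = rng.choice(S, size=b, replace=False)  # b reachable successors
            P[a, s, succ] = rng.dirichlet(np.ones(b))    # random branch weights
    r_base = rng.normal(size=(S, A))                     # mean-field-free reward

    def reward(mu):
        # Crowd-averse coupling: a state pays less the more crowded it is.
        return r_base - coupling * mu[:, None]

    return P, reward
```

Drawing many such instances and aggregating exploitability statistics across them is what turns a single anecdotal demonstration into the "rigorous statistical testing" the abstract calls for.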
Merits
Comprehensive Benchmark Suite
Bench-MFG provides a standardized and rigorous framework for evaluating MFG algorithms, addressing a significant gap in the field.
Taxonomy and Prototypical Environments
The proposed taxonomy, spanning no-interaction, monotone, potential, and dynamics-coupled games, covers the main structural classes of stationary MFGs, and the accompanying prototypical environments support a structured assessment of algorithm performance.
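To make the taxonomy concrete: the monotone class is usually defined via the Lasry-Lions monotonicity condition, which guarantees uniqueness of the equilibrium. The snippet below numerically checks that condition for the textbook crowd-averse reward r(s, mu) = -mu(s); it is a minimal illustration of the standard definition, not code from the suite:

```python
# Numerical check of Lasry-Lions monotonicity for a crowd-averse reward:
# a reward r(., mu) is monotone if, for all distributions mu and nu,
#   sum_s (r(s, mu) - r(s, nu)) * (mu(s) - nu(s)) <= 0.
# For r(s, mu) = -mu(s) the left-hand side is -||mu - nu||^2, so the
# condition holds with strict inequality whenever mu != nu.
import numpy as np

rng = np.random.default_rng(0)
S = 10
for _ in range(1000):
    mu, nu = rng.dirichlet(np.ones(S)), rng.dirichlet(np.ones(S))
    lhs = np.dot(-mu - (-nu), mu - nu)   # = -||mu - nu||^2
    assert lhs <= 1e-12
```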
Novel Algorithm Introduction
The introduction of MF-PSO offers a new approach to exploitability minimization, contributing to the methodological diversity in the field.
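The abstract describes MF-PSO only as a black-box approach to exploitability minimization, so its precise design must be taken from the paper itself. The sketch below shows one natural instantiation consistent with that description: a generic global-best particle-swarm search over softmax policy logits, scored by exploitability. The hyperparameters, the softmax parameterization, and the reuse of the exploitability() helper from the earlier sketch are all illustrative assumptions, not the authors' algorithm:

```python
# Hedged sketch of a PSO-style black-box search over policy logits, in the
# spirit of (but not identical to) the paper's MF-PSO. A standard global-best
# PSO minimizes an arbitrary fitness function over R^dim.
import numpy as np

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def pso_minimize(fitness, dim, n_particles=30, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=None):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_particles, dim))          # positions (policy logits)
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_f = np.array([fitness(p) for p in x])
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        f = np.array([fitness(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest, pbest_f.min()

# Illustrative fitness: map logits to a policy, recompute the mean field that
# policy induces, freeze it into the reward, and score by exploitability.
# (induced_mu is a hypothetical helper: the stationary distribution of the
# dynamics under the candidate policy.)
# fitness = lambda th: exploitability(
#     P, reward(induced_mu(softmax(th.reshape(S, A)))),
#     softmax(th.reshape(S, A)), mu0)
```

A derivative-free search is a natural fit here, since the best-response operator inside the exploitability objective is piecewise-defined and non-differentiable in the policy.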
Demerits
Limited Scope
The focus on discrete-time, discrete-space, stationary settings leaves continuous-time, continuous-space, and non-stationary MFGs outside the suite's scope.
Empirical Focus
The study relies heavily on empirical results, which may not fully capture the theoretical nuances of MFG algorithms.
Expert Commentary
The introduction of Bench-MFG represents a significant step toward standardized evaluation protocols for Mean Field Games. The suite addresses a critical gap: without shared environments and metrics, the robustness, generalization, and failure modes of competing algorithms cannot be compared meaningfully. The taxonomy and prototypical environments span the main structural classes of stationary MFGs, while MF-Garnets makes it possible to draw random problem instances, moving evaluation from anecdotal demonstrations toward statistical testing. MF-PSO, a black-box approach to exploitability minimization, adds methodological diversity to a field dominated by fixed-point, fictitious-play, and mirror-descent schemes.

The limitations are real but clearly scoped. The restriction to discrete-time, discrete-space, stationary settings excludes continuous and non-stationary MFGs, and the emphasis on empirical results leaves the theoretical properties of the benchmarked algorithms largely unexamined. Even so, Bench-MFG gives researchers and practitioners a rigorous, reproducible basis for comparison, and the proposed experimental guidelines are a plausible candidate for a community standard. Overall, the work is a meaningful advance for Mean Field Games and large-scale multi-agent learning.
Recommendations
- Expand the benchmark suite to include continuous-time and continuous-space settings to broaden its applicability.
- Incorporate theoretical analyses to complement the empirical results, providing a more comprehensive understanding of algorithm performance.