Competitions

Competition: MyoChallenge 2025: Towards Human Athletic Intelligence
Vittorio Caggiano · Huiyi Wang · Chun Kwang Tan · Balint Hodossy · Shirui Lyu · Massimo Sartori · Seungmoon Song · Letizia Gionfrida · Guillaume Durandau · Vikash Kumar
Dec 6, 8:00 AM - 10:45 AM · Upper Level Ballroom 6DE

Athletic performance represents the pinnacle of human decision-making: it demands rapid choices, precise motor control, agility, and coordinated physical execution. That combination of capabilities remains elusive in current artificial intelligence and robotic systems. Building on the momentum of the MyoChallenge at NeurIPS 2022, 2023, and 2024, the fourth edition of our MyoChallenge series, Towards Human Athletic Intelligence, moves toward capturing the full expressivity and agility of human athletic performance. Participants will develop behaviors for physiologically realistic musculoskeletal models performing fast-paced, high-skill athletic tasks.

The challenge will feature two tracks. First, a soccer shootout: a full-body musculoskeletal model must dynamically approach and shoot a ball past a moving goalkeeper, which requires balance, foot targeting, force generation, and rapid whole-body coordination. Second, a table tennis competition: a musculoskeletal model of the upper body (arm and trunk) must track, strike, and return balls in a fast-paced rally against an AI opponent. These challenges go far beyond static or repetitive motions; they demand generalization via reactive, adaptive embodied behavior grounded in the physics of muscle, tendon, and joint dynamics, with real-time perception-action loops capable of agile motor control.

The challenge will be staged in the widely used MyoSuite framework, which offers physiologically accurate, state-of-the-art musculoskeletal models and an intuitive interface to scalable reinforcement learning and control libraries. The framework also enables easy onboarding via extensive tutorials and getting-started materials, and provides the baseline libraries needed for the challenge. The competition aims to engage diverse research communities: biomechanics, motor neuroscience, reinforcement learning, control theory, and more. As in previous years, it will prioritize scalability, reproducibility, and generalization, and be open-sourced following best engineering and academic practices, advancing physiological control and bringing us closer to replicating human athletic intelligence.
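For orientation, here is a minimal sketch of interacting with a MyoSuite environment: importing myosuite registers its environments with gym, after which a standard observation-action rollout loop applies. The environment id below is an existing illustrative MyoSuite task, not one of the 2025 challenge tasks, and the reset/step signatures may differ across gym/gymnasium versions.

```python
# A minimal sketch of a MyoSuite rollout with random muscle activations.
# "myoElbowPose1D6MRandom-v0" is an illustrative existing task, not a
# MyoChallenge 2025 task; API details vary with the installed gym version.
import myosuite  # noqa: F401  (importing registers the Myo* environments)
import gym

env = gym.make("myoElbowPose1D6MRandom-v0")
obs = env.reset()
for _ in range(200):
    action = env.action_space.sample()  # random muscle activations
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
env.close()
```

A trained policy would replace the random sampling, typically via the reinforcement learning libraries MyoSuite interfaces with.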
Competition: The Competition of Fairness in AI Face Detection
Shu Hu · Xin Wang · Daniel Schiff · Sachi Mohanty · Ryan Ofman · Wenbin Zhang · Baoyuan Wu · Cristian Ferrer · Xiaoming Liu · Luisa Verdoliva · Siwei Lyu
Dec 6, 8:00 AM - 10:45 AM · Mezzanine Room 15AB

This competition focuses on advancing fairness-aware detection of AI-generated (deepfake) faces and promoting new methodological innovations, addressing a critical gap: fairness methods developed in machine learning have been largely overlooked in deepfake detection. Participants will work with two large-scale datasets provided by the organizers: AI-Face (CVPR 2025), a million-scale, demographically annotated dataset for training and validation, and PDID (AAAI 2024), a newly curated dataset of real-world deepfake incidents reserved for testing.

Participants are tasked with developing models that achieve strong utility (e.g., AUC) while ensuring fairness generalization under real-world deployment conditions. The baseline method, PG-FDD (published at CVPR 2024 by the organizers' group), which demonstrates state-of-the-art fairness generalization for AI face detection, will be provided to support participation. The competition's potential impact includes fostering robust, fair, and generalizable deepfake detectors, raising awareness of fairness challenges in combating AI-generated fakes, and promoting responsible deployment of AI and machine learning in societal applications such as media forensics and digital identity verification. The competition is sponsored by Deep Media AI and Originality.AI. Challenge link: https://sites.google.com/view/aifacedetection/home
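As a rough illustration of what "utility plus fairness generalization" can look like in practice, the sketch below computes overall AUC alongside the worst-case AUC gap across demographic groups. The grouping variable and the gap metric are illustrative assumptions, not the competition's official protocol.

```python
# A sketch of joint utility/fairness evaluation for deepfake detection:
# overall AUC plus the max-min AUC gap across demographic groups.
# Assumes every group contains both real and AI-generated samples.
import numpy as np
from sklearn.metrics import roc_auc_score

def utility_and_fairness(y_true, y_score, groups):
    """y_true: 0/1 NumPy array (1 = AI-generated); y_score: detector
    scores; groups: per-sample demographic labels (e.g., from AI-Face)."""
    overall_auc = roc_auc_score(y_true, y_score)
    group_aucs = {
        g: roc_auc_score(y_true[groups == g], y_score[groups == g])
        for g in np.unique(groups)
    }
    fairness_gap = max(group_aucs.values()) - min(group_aucs.values())
    return overall_auc, group_aucs, fairness_gap
```

A strong entry would keep overall_auc high while driving fairness_gap toward zero on held-out, real-world data such as PDID.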
Competition: EEG Foundation Challenge: From Cross-Task to Cross-Subject EEG Decoding
Bruno Aristimunha · Dung Truong · Pierre Guetschel · Seyed (Yahya) Shirazi · Isabelle Guyon · Alexandre Franco · Michael Milham · Aviv Dotan · Scott Makeig · Alex Gramfort · Jean-Remi King · Marie-Constance Corsi · Pedro Valdés-Sosa · Amitava Majumdar · Alan Evans · Terrence Sejnowski · Oren Shriki · Sylvain Chevallier · Arnaud Delorme
Dec 6, 11:00 AM - 1:45 PM · Mezzanine Room 15AB

Current electroencephalogram (EEG) decoding models are typically trained on specific subjects and specific tasks. This large-scale, code-submission-based competition aims to subsume that approach through two challenges. First, the transfer challenge: build a model that can zero-shot decode new tasks and new subjects from their EEG. Second, the psychopathology factor prediction challenge: predict measures of mental health from EEG data. Both draw on an unprecedented, multi-terabyte dataset of high-density EEG (128 channels) recorded from over 3,000 subjects engaged in multiple active and passive tasks. Several tunable neural network baselines are provided for each challenge, including a simple network and demographic-based regression models. Models that generalize across tasks and individuals will pave the way for EEG architectures capable of adapting to diverse tasks and individuals, and predicting mental health dimensions from EEG is essential for systematically identifying objective biomarkers for clinical diagnosis and personalized treatment. The advances spurred by this challenge are poised to shape the future of neurotechnology and computational psychiatry, catalyzing breakthroughs in both fundamental neuroscience and applied clinical research.

Competition: DCVLR: Data Curation for Vision Language Reasoning
Benjamin Feuer · Rohun Tripathi · Oussama Elachqar · Yuhui Zhang · Neha Hulkund · Thao Nguyen · Vishaal Udandarao · Xiaohan Wang · Sara Beery · Georgia Gkioxari · Emmanouil Koukoumidis · Paul Liang · Ludwig Schmidt · Saining Xie · Serena Yeung-Levy
Dec 6, 11:00 AM - 1:45 PM · Upper Level Ballroom 6DE

This data-centric competition aims to advance the visual reasoning capabilities of vision-language models (VLMs) through instruction-tuning dataset curation. Participants are provided with a pool of 1 million image-text pairs and tasked with generating a small (1K) or large (10K) instruction-tuning dataset using any method of their choice. Submissions will be evaluated by fine-tuning a fixed VLM (Molmo) on the curated data and measuring performance on VMCBench, a newly released benchmark of multiple-choice visual reasoning questions spanning six diverse datasets. The competition provides all necessary resources, including the image-text pool, fine-tuning scripts, evaluation code, baselines generated using GPT-4o and Claude, and USD 400 in GPU compute from Lambda Labs. The evaluation metric is accuracy, and all training and evaluation will be reproduced by the organizers on standardized infrastructure. The challenge reframes data curation as the primary variable for scientific investigation, with implications for adapting foundation models to real-world domains such as education, biomedicine, and scientific reasoning. It aims to foster broad participation across academia and industry, democratizing model adaptation by focusing on data quality rather than computational scale.
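Because any curation method is allowed, even a simple score-then-select pipeline is a valid starting point. The sketch below, with a deliberately naive placeholder heuristic, shows the shape of such a baseline; competitive entries would substitute stronger signals such as model-based difficulty, diversity, and deduplication.

```python
# A sketch of a score-then-select curation baseline: rank the pool of
# image-text pairs with a heuristic and keep the top 1K or 10K for
# instruction tuning. The scoring function is a naive placeholder.
def curate(pool, score_fn, k=1_000):
    """pool: list of (image, text) pairs; score_fn: higher is better."""
    return sorted(pool, key=score_fn, reverse=True)[:k]

def toy_score(pair):
    _, text = pair
    # Placeholder heuristic: prefer longer, question-like captions.
    return len(text.split()) + (5 if "?" in text else 0)

# subset = curate(pool, toy_score, k=10_000)
```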
Competition: CURE-Bench: Competition on Reasoning Models for Drug Decision-Making in Precision Therapeutics
Shanghua Gao · Richard Zhu · Zhenglun Kong · Xiaorui Su · Curtis Ginder · Sufian Aldogom · Ishita Das · Taylor Evans · Theodoros Tsiligkaridis · Marinka Zitnik
Dec 6, 2:00 PM - 4:45 PM · Upper Level Ballroom 6DE

Precision therapeutics require models that can reason over complex relationships between patients, diseases, and drugs. Large language models and large reasoning models, especially when combined with external tool use and multi-agent coordination, have demonstrated the potential to perform structured, multi-step reasoning in clinical settings, yet existing benchmarks (mostly QA benchmarks) do not evaluate these capabilities in the context of real-world therapeutic decision-making. CURE-Bench is a competition and benchmark for evaluating AI models in drug decision-making and treatment planning. It includes clinically grounded tasks such as recommending treatments, assessing drug safety and efficacy, designing treatment plans, and identifying repurposing opportunities for diseases with limited therapeutic options. The competition has two tracks: one for models reasoning with internal knowledge, and another for agentic reasoning that integrates external tools and real-time information. Evaluation data are generated using a validated multi-agent pipeline that produces realistic questions, reasoning traces, and tool-based solutions. Participants will have access to baselines spanning both open-weight and API-based models, along with standardized metrics for correctness, factuality, interpretability, and robustness; human expert evaluation provides an additional layer of validation. CURE-Bench thus offers a rigorous, reproducible framework for assessing the performance, robustness, and interpretability of reasoning models in high-stakes clinical applications, accelerating the development of therapeutic AI and fostering collaboration between the AI and therapeutics communities.

Competition: Open Polymer Challenge: Leveraging Machine Learning for Polymer Informatics
Gang Liu · Sobin Alosious · Yuhan Liu · Eric Inae · Yihan Zhu · Renzheng Zhang · Jiaxin Xu · Addison Howard · Ying Li · Tengfei Luo · Meng Jiang
Dec 6, 2:00 PM - 4:45 PM · Mezzanine Room 15AB

Machine learning (ML) holds immense potential for discovering sustainable polymer materials, yet progress is hindered by the lack of high-quality open data. This challenge provides an open-sourced dataset ten times larger than existing ones, along with competitive ML baselines and evaluation pipelines, and targets multi-task polymer property prediction, which is crucial for virtual screening of polymers. Participants are asked to develop accurate prediction models focused on material properties. A variety of techniques can be leveraged: data augmentation and imbalanced learning, learning paradigms such as transfer learning and self-supervised learning, and novel model architectures with a good inductive bias for polymers. The results will directly accelerate the discovery of novel polymers for sustainable and energy-saving materials.
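One concrete difficulty the multi-task framing implies is that polymer datasets rarely label every property for every polymer. Below is a minimal PyTorch sketch of a masked regression loss that trains on whatever labels exist; the (batch, n_properties) layout is an assumption for illustration.

```python
# A sketch of a masked MSE loss for multi-task property regression with
# partially observed labels. pred/target: (batch, n_properties) tensors;
# mask: 1.0 where a label exists. Missing entries in target should be
# zero-filled (not NaN) so the masked product stays finite.
import torch

def masked_mse(pred, target, mask):
    se = (pred - target) ** 2 * mask
    return se.sum() / mask.sum().clamp(min=1)

# Hypothetical usage with any model mapping polymer features to targets:
# loss = masked_mse(model(x), y, y_mask)
```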
Competition: Early Training Scientific Knowledge and Reasoning Evaluation of Small Language Models
Mouadh Yagoubi · Yasser Abdelaziz Dahou Djilali · Billel Mokeddem · Younes Belkada · Phúc Lê Khắc · Basma Boussaha · Reda Alami · Jingwei Zuo · Damiano Marsili · Mugariya Farooq · Mounia Lalmas · Georgia Gkioxari · Patrick Gallinari · Philip Torr · Hakim Hacid
Dec 7, 8:00 AM - 10:45 AM · Upper Level Ballroom 6B

Existing benchmarks have proven effective for assessing fully trained large language models, yet in the early training stages of small models they often fail to provide meaningful or discriminative signals. To explore how these differences arise, this competition tackles the challenge of designing scientific knowledge evaluation tasks specifically tailored to measuring early training progress. Participants are invited to develop novel evaluation methodologies, or to adapt existing benchmarks, to better capture performance differences among language models. To support this effort, the organizers provide three pre-trained small models (0.5B, 1B, and 3B parameters), along with intermediate checkpoints sampled during training up to 200B tokens. All experiments and development work can be run on widely available free cloud-based GPU platforms, making participation accessible to researchers with limited computational resources. Submissions will be evaluated on three criteria: the quality of the performance signal they produce, the consistency of model rankings at 1 trillion tokens of training (see the sketch after this listing), and their relevance to the scientific knowledge domain. By promoting tailored evaluation strategies for early training, the competition aims to attract participants from many disciplines, including those who are not machine learning experts or who lack dedicated GPU resources, and ultimately to make foundational LLM research more systematic and benchmark-informed from the earliest phases of model development.

Competition: The MindGames Challenge: Theory-of-Mind and Game Intelligence in LLM Agents
Kevin Wang · Jianzhu Yao · Yihan Jiang · Benjamin Finch · Viraj Nadkarni · Benjamin Kempinski · Anna C. M. Thöni · Mathieu Lauriere · Maria Polukarov · Pramod Viswanath · Tal Kachman · Yoram Bachrach · Zhangyang "Atlas" Wang
Dec 7, 8:00 AM - 10:45 AM · Upper Level Ballroom 6CF

Recent breakthroughs in large language models have revolutionized natural language processing and spawned new classes of multi-agent AI systems. Yet essential gaps remain in such systems' abilities to model beliefs, detect deception, coordinate effectively under uncertainty, and plan in longer-term dynamic environments, capacities collectively known as "theory of mind". The MindGames Challenge seeks to address these gaps by testing and advancing the cooperative intelligence of LLM agents across multiple distinct social-deduction and coordination tasks. Participants will develop agents that (i) communicate via natural language, (ii) reason about hidden states and competing objectives, and (iii) dynamically adapt strategies in repeated and iterative interactions.

Competition: The PokéAgent Challenge: Competitive and Long-Context Learning at Scale
Seth Karten · Jake Grigsby · Stephanie Milani · Kiran Vodrahalli · Amy Zhang · Fei Fang · Yuke Zhu · Chi Jin
Dec 7, 8:00 AM - 10:45 AM · Mezzanine Room 15AB

While frontier AI models excel at language understanding, math reasoning, and code generation, they underperform in out-of-distribution generalization, adaptation to strategic opponents, game-theoretic decision-making, and long-context reasoning and planning. To address these gaps, we introduce the PokéAgent Challenge,
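Returning to the early-training evaluation competition, its ranking-consistency criterion can be probed directly with a rank correlation: compare how a candidate benchmark orders the three provided models at an early checkpoint against their ordering at 1 trillion tokens. A minimal sketch with hypothetical scores:

```python
# A sketch of ranking consistency via Spearman correlation: does a
# candidate benchmark order models at an early checkpoint the same way
# the reference evaluation orders them at 1T tokens? Scores are made up.
from scipy.stats import spearmanr

early_scores = [0.31, 0.55, 0.42]  # candidate benchmark at, e.g., 50B tokens
final_scores = [0.48, 0.77, 0.61]  # reference evaluation at 1T tokens

rho, _ = spearmanr(early_scores, final_scores)
print(f"ranking consistency (Spearman rho): {rho:.2f}")
```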

Executive Summary

This article presents a slate of NeurIPS competitions aimed at advancing artificial intelligence (AI), opening with two contrasting examples. The MyoChallenge 2025 seeks to capture the full expressivity and agility of human athletic performance through physiologically realistic musculoskeletal models performing fast-paced, high-skill athletic tasks, across two tracks: a soccer shootout and a table tennis competition. The Competition of Fairness in AI Face Detection, by contrast, aims to advance fairness-aware detection of AI-generated faces and promote new methodological innovations. Like its predecessors, the MyoChallenge prioritizes scalability, reproducibility, and generalization, and will be open-sourced following best engineering and academic practices. Together with the other featured challenges, these competitions have the potential to drive innovation and advancement in AI, athletic performance, and fairness in AI applications.

Key Points

  • MyoChallenge 2025 aims to capture human athletic performance through musculoskeletal models
  • The MyoChallenge features two tracks: a soccer shootout and a table tennis competition
  • The Competition of Fairness in AI Face Detection aims to advance fairness-aware deepfake detection
  • The MyoChallenge prioritizes scalability, reproducibility, and generalization, and will be open-sourced following best engineering and academic practices
  • Further competitions cover EEG decoding, vision-language data curation, therapeutic reasoning, polymer informatics, early-training evaluation, and LLM game intelligence

Merits

Strength: Interdisciplinary Approach

These competitions bring together researchers from diverse fields: the MyoChallenge alone spans biomechanics, motor neuroscience, reinforcement learning, and control theory, while the other challenges reach into media forensics, clinical neuroscience, drug discovery, and materials science. This interdisciplinary breadth has the potential to drive innovation and advancement in AI, athletic performance, and fairness in AI applications.

Demerits

Limitation: Complexity

The complexity of the challenges, and in the MyoChallenge's case the need for expertise with physiologically realistic musculoskeletal models, may limit participation by researchers without extensive background in these areas.

Expert Commentary

The competitions presented in this article have the potential to drive significant innovation and advancement in AI, athletic performance, and fairness in AI applications. The interdisciplinary approach and the emphasis on scalability, reproducibility, and generalization are particularly noteworthy. The main caveat, noted above, is the barrier to entry; several organizers already mitigate it with extensive tutorials, provided baselines, and free or subsidized compute, and extending such support will be important for increasing accessibility and inclusivity.

Recommendations

  • Encourage participation from researchers with diverse backgrounds and expertise to foster a more inclusive and interdisciplinary approach.
  • Develop strategies for making the challenges and requirements more accessible to researchers without extensive expertise in physiologically realistic musculoskeletal models.
