NeurIPS Conference Tutorials


San Diego / Mexico City · 14 Events

Tutorial: Human-AI Alignment: Foundations, Methods, Practice, and Challenges
Hua Shen · Mitchell Gordon · Adam Tauman Kalai · Yoshua Bengio
Dec 2, 9:30 AM - 12:00 PM · Exhibit Hall F

Tutorial: Model Merging: Theory, Practice and Applications
Marco Ciccone · Malikeh Ehghaghi · Colin Raffel
Dec 2, 9:30 AM - 12:00 PM · Upper Level Room 30A-E

Tutorial: New Frontiers of Hyperparameter Optimization: Recent advances and open challenges in theory and practice
Dravyansh Sharma · Colin White · Maria-Florina Balcan
Dec 2, 9:30 AM - 12:00 PM · Upper Level Ballroom 6CDEF

Machine learning algorithms operate on data, and for any task the most effective method depends on the data at hand. Hyperparameter optimization and algorithm selection are therefore crucial to ensure the best performance in terms of accuracy, efficiency, reliability, interpretability, etc.

We first survey common techniques used in practice for hyperparameter optimization in machine learning, including Bayesian optimization and bandit-based approaches. We next discuss new approaches developed in the context of Large Language Models, including neural scaling laws and parameterization-aware methods. We will discuss chief advantages and shortcomings of these approaches, in particular their limited theoretical guarantees.

We will then discuss exciting new developments on hyperparameter tuning with strong theoretical guarantees. A growing line of work over the past decade from the learning theory community has successfully analysed how the algorithmic performance actually varies with the hyperparameter for several fundamental algorithms in machine learning, including decision trees, linear regression, unsupervised and semi-supervised learning, and very recently even deep learning.
This has allowed the development of techniques that take this structure into account, apply naturally to both hyperparameter tuning and algorithm selection, work well in dynamic or online learning environments, and are equipped with provable PAC (probably approximately correct) guarantees for the generalization error of the learned hyperparameter. Future research areas include integration of these structure-aware principled approaches with the currently used techniques, better optimization in high-dimensional and discrete spaces, and improving scalability in distributed settings.

Tutorial: Energy and Power as First-Class ML Design Metrics
Jae-Won Chung · Ahmet Inci · Ruofan Wu
Dec 2, 9:30 AM - 12:00 PM · Upper Level Ballroom 6AB

Tutorial: Planning in the Era of Language Models
Michael Katz · Harsha Kokel · Christian Muise
Dec 2, 9:30 AM - 12:00 PM · Upper Level Ballroom 20AB

For over six decades, the field of automated planning has been at the heart of AI, empowering intelligent systems to reason, act, and achieve goals in complex, dynamic environments. From robotics and logistics to space exploration, planning research has fueled autonomous decision-making in real-world applications. Today, as large language models redefine what's possible in AI, the principles and methodologies of planning are more vital than ever. The planning community brings decades of experience in designing, benchmarking, and interpreting intelligent behavior; expertise that can accelerate the development of powerful, trustworthy, and general-purpose LLM-based agents. Participants will gain a clear understanding of what planning truly entails, what has been learned (and sometimes forgotten) in the shift toward LLM-based approaches, and how foundational insights from the planning community can inform the creation of stronger, more reliable, and more scalable LLM-powered planners.
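The bandit-based approaches surveyed in the hyperparameter optimization tutorial above can be illustrated with successive halving, the core subroutine of Hyperband. Below is a minimal sketch; the configuration space, the `toy_loss` objective, and all numbers are illustrative assumptions, not taken from the tutorial itself:

```python
import random

def successive_halving(configs, evaluate, min_budget=1, eta=2, rounds=4):
    """Bandit-style successive halving: score every configuration on a
    small budget, keep the best 1/eta fraction, grow the budget by eta,
    and repeat until one configuration remains."""
    budget = min_budget
    survivors = list(configs)
    for _ in range(rounds):
        survivors.sort(key=lambda cfg: evaluate(cfg, budget))
        survivors = survivors[:max(1, len(survivors) // eta)]
        budget *= eta
    return survivors[0]

# Hypothetical objective: each config is a learning rate, and the
# "validation loss" after `budget` epochs is a noisy quadratic around
# an assumed optimum of 0.1; more budget means a less noisy estimate.
def toy_loss(lr, budget):
    return (lr - 0.1) ** 2 + random.gauss(0, 0.5 / budget)

random.seed(0)
candidates = [10 ** random.uniform(-4, 0) for _ in range(16)]
best_lr = successive_halving(candidates, toy_loss)
print(f"best learning rate found: {best_lr:.4f}")
```

The key design point is that early, cheap evaluations are only trusted enough to discard the worst configurations, while expensive high-budget evaluations are reserved for the few survivors.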
Tutorial: Foundations of Tensor/Low-Rank Computations for AI
Grigorios Chrysos · Evrim Acar · Antonio Vergari
Dec 2, 9:30 AM - 12:00 PM · Upper Level Room 28A-E

Tutorial: Explain AI Models: Methods and Opportunities in Explainable AI, Data-Centric AI, and Mechanistic Interpretability
Shichang (Ray) Zhang · Himabindu Lakkaraju · Julius Adebayo
Dec 2, 9:30 AM - 12:00 PM · Exhibit Hall G,H

Understanding AI system behavior has become critical for safety, trust, and effective deployment across diverse applications. Three major research communities have emerged to address this challenge through interpretability methods: Explainable AI focuses on feature attribution to understand which input features drive model decisions; Data-Centric AI emphasizes data attribution to analyze how training examples shape model behavior; and Mechanistic Interpretability examines component attribution to understand how internal model components contribute to outputs. These three branches share the goal of better understanding AI systems across different aspects and differ primarily in their perspectives rather than techniques.

This tutorial begins with foundational concepts and historical context, providing essential background on why explainability matters and how the field has evolved since its early days. The first technical deep dive covers post hoc explanation methods, data-centric explanation techniques, mechanistic interpretability approaches, and presents a unified framework demonstrating that these methods share fundamental techniques such as perturbations, gradients, and local linear approximations. The second technical deep dive explores inherently interpretable models, clarifying concepts like reasoning (chain-of-thought) LLMs and self-explanatory LLMs in the context of explainability, and techniques for building inherently interpretable LLMs. We also showcase open source tools that make these methods accessible to practitioners.
Furthermore, we highlight promising future research directions in interpretability research and the induced future directions in AI more broadly, with applications in model editing, steering, and regulation. Through comprehensive coverage of algorithms, real-world case studies, and practical guidance, attendees will gain both a deep technical understanding of state-of-the-art methods and practical skills to apply interpretability techniques effectively in AI applications.

Tutorial: Theoretical Insights on Training Instability in Deep Learning
Jingfeng Wu · Yu-Xiang Wang · Maryam Fazel
Dec 2, 1:30 PM - 4:00 PM · Upper Level Ballroom 6AB

The advances in deep learning build on the dark arts of gradient-based optimization. In deep learning, the optimization process is oscillatory, spiky, and unstable. This makes little sense in classical optimization theory, which primarily operates in a well-behaved, stable regime. Yet, the best training configuration in practice constantly operates in an unstable regime. This tutorial introduces recent theoretical progress in understanding the benign nature of training instabilities, providing new insights from both optimization and statistical learning perspectives.

Tutorial: Autoregressive Models Beyond Language
Tianhong Li · Huiwen Chang · Kaiming He
Dec 2, 1:30 PM - 4:00 PM · Upper Level Room 30A-E

Autoregressive modeling is no longer confined to language. Recent work shows that the same next-element prediction principle can achieve state-of-the-art performance in generative modeling, representation learning, and multi-modal tasks across images, video, audio, robotics, and scientific data. Yet, extending autoregressive methods to these data is far from straightforward.
Many inductive biases used in autoregressive language models no longer hold for other data modalities, and thus, many new techniques have been proposed in recent years to adapt autoregressive models to data beyond language. This tutorial will review the core theory of autoregressive models, present practical design choices for generative modeling, representation learning, and multi-modal learning, and spotlight open challenges in this area. We hope our tutorial can provide the attendees with a clear conceptual roadmap and hands-on resources to apply and extend autoregressive techniques across diverse data domains.

Tutorial: Data Privacy, Memorization, & Legal Implications in Generative AI: A Practical Guide
Pratyush Maini · Joseph C. Gratz · A. Feder Cooper
Dec 2, 1:30 PM - 4:00 PM · Exhibit Hall F

Tutorial: The Science of Benchmarking: What's Measured, What's Missed, and What's Next
Ziqiao Ma · Michael Saxon · Xiang Yue
Dec 2, 1:30 PM - 4:00 PM · Exhibit Hall G,H

Tutorial: Foundations of Imitation Learning: From Language Modeling to Continuous Control
Adam Block · Dylan Foster · Max Simchowitz
Dec 2, 1:30 PM - 4:00 PM · Upper Level Ballroom 20AB

Tutorial: Scale Test-Time Compute on Modern Hardware
Zhuoming Chen · Beidi Chen · Azalia Mirhoseini
Dec 2, 1:30 PM - 4:00 PM · Upper Level Ballroom 6CDEF

Large language models have achieved significant breakthroughs in reasoning tasks, relying on the effective use of test-time compute. Techniques such as chain-of-thought and sampling-based strategies have shown that increasing test-time computation can dramatically enhance model performance. Our recent scaling law analyses highlight the critical role of test-time compute in enabling advanced reasoning, beyond what pretraining can offer. We also provide a practical analysis of hardware efficiency, revealing where bottlenecks arise and how they differ fundamentally from those in pretraining.
Scaling test-time compute on modern hardware presents unique challenges. Compared to training workloads, test-time compute often exhibits low parallelism, irregular workload, frequent memory I/O, and dynamic execution paths, all of which make efficient deployment difficult. Therefore, practical scalability is often bottlenecked by system constraints, such as attention-related memory overheads and limited compute utilization.

To address these challenges, the community has explored solutions across both systems and algorithms. On the system side, advancements include memory-efficient key-value cache management, optimized attention kernels, and scheduling mechanisms for adaptive resource allocation. On the algorithm side, emerging work has proposed model architectures and parallel generation paradigms that better align with hardware.

This tutorial aims to provide a comprehensive overview of the landscape of scalable test-time compute. We will cover foundational challenges, review recent progress from both system and algorithm perspectives, and discuss principles for building solutions that are truly compatible with modern hardware. By bridging theory with deployment realities, we hope this tutorial will inspire and accelerate the development of practical, scalable LLM agent systems.

Tutorial: Recent Developments in Geometric Machine Learning: Foundations, Models, and More
Behrooz Tahmasebi · Stefanie Jegelka
Dec 2, 1:30 PM - 4:00 PM · Upper Level Room 28A-E
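The attention-related memory overheads described in the test-time compute abstract are dominated by the key-value cache. A back-of-the-envelope sizing sketch, assuming a hypothetical grouped-query-attention model (all dimensions below are illustrative, not those of any specific released checkpoint):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch, dtype_bytes=2):
    """Total KV-cache size: one key and one value vector (hence the
    factor of 2) per layer, per KV head, per token, per batch element."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * dtype_bytes

# Hypothetical 8B-class model with grouped-query attention, serving a
# batch of 16 requests at an 8K context in fp16 (2 bytes per element).
gib = kv_cache_bytes(
    n_layers=32, n_kv_heads=8, head_dim=128,
    seq_len=8192, batch=16, dtype_bytes=2,
) / 2**30
print(f"KV cache: {gib:.1f} GiB")  # → KV cache: 16.0 GiB
```

Even for these modest assumed dimensions, the cache rivals the weights in size, which is why memory-efficient KV-cache management is a central system-side lever for scaling test-time compute.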

Executive Summary

This article presents an overview of the fourteen tutorials held on December 2 at the NeurIPS conference, spanning topics including human-AI alignment, model merging, hyperparameter optimization, planning in the era of language models, explainable AI, training instability, autoregressive models beyond language, and scaling test-time compute. The published abstracts are most detailed for the hyperparameter optimization tutorial, which surveys techniques commonly used in practice, newer approaches developed for large language models such as neural scaling laws and parameterization-aware methods, and an emerging line of structure-aware approaches with provable guarantees from the learning theory community. The article highlights the shortcomings of current methods, in particular their limited theoretical guarantees, and identifies future research directions: integrating structure-aware principled approaches with currently used techniques, better optimization in high-dimensional and discrete spaces, and improving scalability in distributed settings.

Key Points

  • Hyperparameter optimization and algorithm selection are crucial for performance in terms of accuracy, efficiency, reliability, and interpretability
  • New approaches developed in the context of large language models, including neural scaling laws and parameterization-aware methods
  • Structure-aware principled approaches from learning theory offer provable PAC guarantees and can be integrated with currently used techniques
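As a toy illustration of the neural scaling laws mentioned in the key points above, a power law L(N) = a · N^(-b) can be fit by ordinary least squares in log-log space. The data here is synthetic, generated from an assumed law purely to show the mechanics; the coefficients are not from any published scaling-law study:

```python
import math

# Synthetic (model size, loss) pairs generated from an assumed power
# law L(N) = 2.0 * N ** -0.3; the fit should recover both constants.
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [2.0 * s ** -0.3 for s in sizes]

# Least-squares line through the log-log transformed points:
# log L = log a + (-b) * log N.
xs = [math.log(s) for s in sizes]
ys = [math.log(l) for l in losses]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
a = math.exp(my - slope * mx)
print(f"fitted exponent: {-slope:.3f}, coefficient: {a:.3f}")
# → fitted exponent: 0.300, coefficient: 2.000
```

In practice the same log-log regression is applied to noisy measured losses, and the fitted curve is then extrapolated to predict performance at larger scales.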

Merits

Strength in Surveying Current Techniques

The article provides an overview of common techniques used in practice for hyperparameter optimization, including Bayesian optimization and bandit-based approaches, as well as new approaches developed in the context of Large Language Models.
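Of the surveyed techniques, Bayesian optimization is usually driven by a dedicated library, but the random-search baseline against which such methods are commonly benchmarked fits in a few lines. A minimal sketch; the objective below is a hypothetical smooth stand-in for a real validation loss:

```python
import random

def random_search(objective, space, n_trials=50, seed=0):
    """Draw hyperparameters uniformly from `space` (name -> (low, high))
    and keep the best trial. This is the standard baseline that Bayesian
    optimization and bandit-based methods are measured against."""
    rng = random.Random(seed)
    best_cfg, best_val = None, float("inf")
    for _ in range(n_trials):
        cfg = {k: rng.uniform(lo, hi) for k, (lo, hi) in space.items()}
        val = objective(cfg)
        if val < best_val:
            best_cfg, best_val = cfg, val
    return best_cfg, best_val

# Hypothetical objective: a smooth bowl with its minimum at
# lr = 0.3, momentum = 0.9, standing in for validation loss.
obj = lambda c: (c["lr"] - 0.3) ** 2 + (c["momentum"] - 0.9) ** 2
cfg, val = random_search(obj, {"lr": (0.0, 1.0), "momentum": (0.0, 1.0)})
print(cfg, val)
```

Bayesian optimization improves on this baseline by fitting a surrogate model to past trials and proposing the next configuration where the surrogate predicts the most promise, rather than sampling blindly.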

In-Depth Analysis of Challenges and Limitations

The article discusses the chief advantages and shortcomings of current approaches, including their limited theoretical guarantees, and identifies future research areas.

Demerits

Limited Theoretical Guarantees

Current approaches have limited theoretical guarantees, which may hinder their adoption in critical applications.

Scalability Issues

The article highlights the need for improving scalability in distributed settings, which may be a significant challenge in large-scale applications.

Expert Commentary

The tutorial program collectively sketches the current state of machine learning and its applications, but the hyperparameter optimization abstract is notably candid about the limits of today's methods, particularly their weak theoretical guarantees. The structure-aware approaches from learning theory address this gap with provable PAC guarantees, and the open problems the abstract lists, namely integration with currently used techniques, better optimization in high-dimensional and discrete spaces, and improved scalability in distributed settings, are crucial for developing more robust and reliable machine learning systems.

Recommendations

  • Invest in research and development of structure-aware principled approaches to hyperparameter optimization.
  • Explore new techniques for improving scalability in distributed settings and better optimization in high-dimensional and discrete spaces.
