Momentum Guidance: Plug-and-Play Guidance for Flow Models
arXiv:2602.20360v1 Abstract: Flow-based generative models have become a strong framework for high-quality generative modeling, yet pretrained models are rarely used in their vanilla conditional form: conditional samples without guidance often appear diffuse and lack fine-grained detail due to the smoothing effects of neural networks. Existing guidance techniques such as classifier-free guidance (CFG) improve fidelity but double the inference cost and typically reduce sample diversity. We introduce Momentum Guidance (MG), a new dimension of guidance that leverages the ODE trajectory itself. MG extrapolates the current velocity using an exponential moving average of past velocities and preserves the standard one-evaluation-per-step cost. It matches the effect of standard guidance without extra computation and can further improve quality when combined with CFG. Experiments demonstrate MG's effectiveness across benchmarks. Specifically, on ImageNet-256, MG achieves average improvements in FID of 36.68% without CFG and 25.52% with CFG across various sampling settings, attaining an FID of 1.597 at 64 sampling steps. Evaluations on large flow-based models like Stable Diffusion 3 and FLUX.1-dev further confirm consistent quality enhancements across standard metrics.
Executive Summary
This article introduces Momentum Guidance (MG), a guidance technique for improving sample quality in flow-based generative models. MG uses the ODE sampling trajectory itself: at each step it extrapolates the current velocity using an exponential moving average of past velocities. Because it reuses velocities the sampler has already computed, MG keeps the standard cost of one network evaluation per step, whereas classifier-free guidance (CFG) doubles the inference cost. On ImageNet-256, MG improves FID by an average of 36.68% without CFG and 25.52% with CFG across sampling settings, reaching an FID of 1.597 at 64 sampling steps. Evaluations on Stable Diffusion 3 and FLUX.1-dev are reported to show consistent quality gains on standard metrics. The method is most relevant where high-fidelity generation matters, such as image synthesis and data augmentation. Combining MG with CFG can improve quality further.
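The mechanism described above can be sketched as a small Euler sampler. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract says only that MG extrapolates the current velocity using an exponential moving average of past velocities, so the specific update rule `v + weight * (v - ema)` and the names `sample_with_momentum`, `weight`, and `decay` are hypothetical choices made here.

```python
def sample_with_momentum(velocity_fn, x0, num_steps=64, weight=0.5, decay=0.9):
    """Euler-integrate dx/dt = v(x, t) from t=0 to t=1 with a momentum term.

    ASSUMPTION: the extrapolation rule below is an illustrative guess at
    "extrapolate the current velocity using an EMA of past velocities";
    the abstract does not give the exact formula.
    Works on floats or NumPy arrays, since it uses only arithmetic.
    """
    x = x0
    dt = 1.0 / num_steps
    ema = None   # exponential moving average of past velocities
    evals = 0    # number of network (velocity_fn) evaluations
    for i in range(num_steps):
        t = i * dt
        v = velocity_fn(x, t)   # the only model call in this step
        evals += 1
        if ema is None:
            ema = v             # no history yet, so the first step is unguided
        # Push the velocity away from its running average (assumed rule).
        v_guided = v + weight * (v - ema)
        # Fold the current velocity into the history for the next step.
        ema = decay * ema + (1.0 - decay) * v
        x = x + dt * v_guided
    return x, evals


# Toy usage with a hypothetical velocity field that pulls x toward 1.0.
sample, n_evals = sample_with_momentum(lambda x, t: 1.0 - x, 0.0, num_steps=8)
```

Setting `weight=0.0` recovers a plain Euler sampler, and the evaluation count equals the number of steps for any `weight`, which is the cost property the abstract highlights.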
Key Points
- ▸ Momentum Guidance (MG) is a novel technique for improving the quality of flow-based generative models
- ▸ MG extrapolates the current velocity using an exponential moving average of past velocities along the ODE trajectory, preserving the standard one-evaluation-per-step cost
- ▸ On ImageNet-256, MG improves FID by an average of 36.68% without CFG and 25.52% with CFG, reaching an FID of 1.597 at 64 sampling steps
Merits
Improved Sample Quality
On ImageNet-256, MG achieves average improvements in FID of 36.68% without CFG and 25.52% with CFG across sampling settings, demonstrating its effectiveness in generating high-quality samples
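One way to read these percentages, assuming they are relative reductions in FID averaged over sampling settings (the abstract does not define the calculation, and the symbols below are introduced here for illustration only):

```latex
\text{improvement} = \frac{\mathrm{FID}_{\text{base}} - \mathrm{FID}_{\text{MG}}}{\mathrm{FID}_{\text{base}}} \times 100\%
```

Lower FID is better, so under this reading a 36.68% improvement means the FID with MG is on average roughly 63% of the baseline value.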
Computational Efficiency
MG preserves the standard one-evaluation-per-step cost, whereas CFG doubles the inference cost, making MG a computationally cheaper alternative to existing guidance techniques
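The cost difference can be made concrete by counting network evaluations. The helper below is a hypothetical illustration (the name `network_evaluations` is introduced here); it assumes CFG runs one conditional and one unconditional pass per step, as the abstract's "double the inference cost" implies, and that MG adds no passes of its own.

```python
def network_evaluations(num_steps: int, uses_cfg: bool) -> int:
    """Count model forward passes for one sampling run.

    ASSUMPTION: CFG costs two passes per step (conditional + unconditional);
    MG adds none, because it only reuses velocities already computed.
    """
    passes_per_step = 2 if uses_cfg else 1
    return num_steps * passes_per_step


print(network_evaluations(64, uses_cfg=False))  # MG alone at 64 steps: 64
print(network_evaluations(64, uses_cfg=True))   # CFG, with or without MG: 128
```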
Flexibility and Scalability
MG can be combined with existing guidance techniques such as CFG to improve quality further, and the reported gains extend to large flow-based models such as Stable Diffusion 3 and FLUX.1-dev
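A rough sketch of how the two kinds of guidance might be stacked follows. The CFG formula is the standard one; applying the momentum term on top of the CFG velocity is an assumption made here, since the abstract does not say how the two are combined. `ToyModel`, `cfg_velocity`, `sample_cfg_with_momentum`, `weight`, and `decay` are hypothetical names.

```python
class ToyModel:
    """Hypothetical stand-in for a pretrained velocity network; counts calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x, t, conditional):
        self.calls += 1
        # Arbitrary toy dynamics, chosen only so the example runs.
        return -x if conditional else -0.5 * x


def cfg_velocity(model, x, t, cfg_scale):
    """Standard classifier-free guidance: two model passes per step."""
    v_cond = model(x, t, True)
    v_uncond = model(x, t, False)
    return v_uncond + cfg_scale * (v_cond - v_uncond)


def sample_cfg_with_momentum(model, x0, num_steps, cfg_scale, weight, decay):
    """Euler sampler with momentum applied to the CFG velocity.

    ASSUMPTION: both the extrapolation rule and the order of composition
    (CFG first, then momentum) are illustrative, not taken from the paper.
    """
    x = x0
    dt = 1.0 / num_steps
    ema = None
    for i in range(num_steps):
        t = i * dt
        v = cfg_velocity(model, x, t, cfg_scale)
        if ema is None:
            ema = v  # no history yet, so the first step has no momentum term
        v_guided = v + weight * (v - ema)
        ema = decay * ema + (1.0 - decay) * v
        x = x + dt * v_guided
    return x


model = ToyModel()
sample = sample_cfg_with_momentum(model, 1.0, num_steps=8, cfg_scale=2.0,
                                  weight=0.5, decay=0.9)
```

In this sketch the momentum term adds only a few arithmetic operations per step, so the number of model calls is set entirely by CFG.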
Demerits
Limited Evaluation Metrics
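The headline results are reported as FID scores, which alone may not be sufficient to evaluate the quality and diversity of generated samples in all scenarios; the abstract mentions "standard metrics" for Stable Diffusion 3 and FLUX.1-dev without naming them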
Lack of Theoretical Foundations
The abstract does not describe a theoretical analysis of why MG works, which may limit understanding of its applicability and generalizability
Expert Commentary
MG is a notable contribution to flow-based generative modeling. By reusing the ODE trajectory, it improves sample quality without extra network evaluations, and it remains compatible with CFG. Two caveats apply. First, a deeper theoretical analysis of why extrapolating from an exponential moving average of past velocities sharpens samples would clarify when MG applies and how well it generalizes. Second, the headline results rest on FID, which may not capture the quality and diversity of generated samples in all scenarios. Even so, because it adds no inference cost, MG could have a substantial impact on generative modeling practice, and its development highlights the value of continued research on sample quality and diversity.
Recommendations
- ✓ Further research is needed to develop a deeper theoretical understanding of MG and its applicability in various scenarios
- ✓ Researchers should explore the use of MG in combination with other guidance techniques to achieve even better results