Sinkhorn-Drifting Generative Models

arXiv:2603.12366v1. Abstract: We establish a theoretical link between the recently proposed "drifting" generative dynamics and gradient flows induced by the Sinkhorn divergence. In a particle discretization, the drift field admits a cross-minus-self decomposition: an attractive term toward the target distribution and a repulsive/self-correction term toward the current model, both expressed via one-sided normalized Gibbs kernels. We show that Sinkhorn divergence yields an analogous cross-minus-self structure, but with each term defined by entropic optimal-transport couplings obtained through two-sided Sinkhorn scaling (i.e., enforcing both marginals). This provides a precise sense in which drifting acts as a surrogate for a Sinkhorn-divergence gradient flow, interpolating between one-sided normalization and full two-sided Sinkhorn scaling. Crucially, this connection resolves an identifiability gap in prior drifting formulations: leveraging the definiteness of the Sinkhorn divergence, we show that zero drift (equilibrium of the dynamics) implies that the model and target measures match. Experiments show that Sinkhorn drifting reduces sensitivity to kernel temperature and improves one-step generative quality, trading off additional training time for a more stable optimization, without altering the inference procedure used by drift methods. These theoretical gains translate to strong low-temperature improvements in practice: on FFHQ-ALAE at the lowest temperature setting we evaluate, Sinkhorn drifting reduces mean FID from 187.7 to 37.1 and mean latent EMD from 453.3 to 144.4, while on MNIST it preserves full class coverage across the temperature sweep. Project page: https://mint-vu.github.io/SinkhornDrifting/
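To make the contrast in the abstract concrete, the following is a minimal NumPy sketch of a cross-minus-self particle drift computed two ways: with a one-sided (row-normalized) Gibbs kernel, and with an entropic optimal-transport coupling obtained by two-sided Sinkhorn scaling. It is an illustration under stated assumptions, not the authors' implementation: the squared-Euclidean cost, uniform particle weights, the barycentric form of the drift, and all function names (`sq_dists`, `one_sided_plan`, `sinkhorn_plan`, `drift`) are hypothetical choices made here, and the paper's exact scaling and kernel may differ.

```python
import numpy as np


def sq_dists(a, b):
    """Pairwise squared Euclidean distances between rows of a (n, d) and b (m, d)."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)


def _logsumexp(x, axis):
    """Numerically stable log-sum-exp along one axis."""
    m = x.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))).squeeze(axis)


def one_sided_plan(cost, eps):
    """Row-normalized Gibbs kernel: rows sum to 1/n, columns are unconstrained."""
    logits = -cost / eps
    logits -= logits.max(axis=1, keepdims=True)  # stabilize the exponentials
    k = np.exp(logits)
    return k / k.sum(axis=1, keepdims=True) / cost.shape[0]


def sinkhorn_plan(cost, eps, n_iters=500):
    """Entropic OT coupling with uniform marginals, via log-domain Sinkhorn updates."""
    n, m = cost.shape
    log_a = np.full(n, -np.log(n))  # uniform weights on the rows (assumption)
    log_b = np.full(m, -np.log(m))  # uniform weights on the columns (assumption)
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iters):
        # Alternately rescale so that row sums, then column sums, match the marginals.
        f = eps * (log_a - _logsumexp((g[None, :] - cost) / eps, axis=1))
        g = eps * (log_b - _logsumexp((f[:, None] - cost) / eps, axis=0))
    return np.exp((f[:, None] + g[None, :] - cost) / eps)


def drift(x, y, eps, plan_fn):
    """Cross-minus-self drift for model particles x given target samples y.

    The cross term pulls each x_i toward a weighted mean of target samples;
    the self term subtracts the analogous weighted mean of model particles.
    """
    p_xy = plan_fn(sq_dists(x, y), eps)
    p_xx = plan_fn(sq_dists(x, x), eps)
    attract = p_xy @ y / p_xy.sum(axis=1, keepdims=True)
    repel = p_xx @ x / p_xx.sum(axis=1, keepdims=True)
    return attract - repel
```

Calling `drift(x, y, eps, one_sided_plan)` and `drift(x, y, eps, sinkhorn_plan)` on the same particles shows the only difference between the two variants in this sketch: how the weights are normalized. With the one-sided plan, nothing forces every target sample to receive mass; with the Sinkhorn plan, both marginals are enforced.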

Executive Summary

This article links "drifting" generative dynamics to gradient flows induced by the Sinkhorn divergence. In a particle discretization, the drift field decomposes into an attractive (cross) term and a repulsive (self) term, both built from one-sided normalized Gibbs kernels. The authors show that the Sinkhorn divergence yields an analogous cross-minus-self structure, with each term instead defined by entropic optimal-transport couplings obtained through two-sided Sinkhorn scaling. This gives a precise sense in which drifting acts as a surrogate for a Sinkhorn-divergence gradient flow. Because the Sinkhorn divergence is definite, the connection also resolves an identifiability gap in prior drifting formulations: zero drift implies that the model and target measures match. Experiments show that Sinkhorn drifting reduces sensitivity to kernel temperature and improves one-step generative quality, at the cost of additional training time and without altering the inference procedure.
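For reference, the definiteness property mentioned in this summary concerns the standard debiased form of the Sinkhorn divergence. The notation below is assumed here and may differ from the paper's: α is the model measure, β the target measure, c the ground cost, ε the entropic regularization (the kernel temperature), and Π(α, β) the set of couplings with marginals α and β.

```latex
\begin{align*}
\mathrm{OT}_\varepsilon(\alpha,\beta)
  &= \min_{\pi \in \Pi(\alpha,\beta)}
     \int c(x,y)\,\mathrm{d}\pi(x,y)
     + \varepsilon\,\mathrm{KL}\!\left(\pi \,\middle\|\, \alpha \otimes \beta\right), \\
S_\varepsilon(\alpha,\beta)
  &= \mathrm{OT}_\varepsilon(\alpha,\beta)
     - \tfrac{1}{2}\,\mathrm{OT}_\varepsilon(\alpha,\alpha)
     - \tfrac{1}{2}\,\mathrm{OT}_\varepsilon(\beta,\beta).
\end{align*}
```

Under suitable conditions on the cost, S_ε(α, β) ≥ 0 with equality only when α = β. The cross term OT_ε(α, β) and the self term OT_ε(α, α) are the two pieces that depend on the model, which is where a cross-minus-self structure comes from; OT_ε(β, β) depends only on the target.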

Key Points

  • Establishes a theoretical link between 'drifting' generative dynamics and Sinkhorn divergence.
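  • Shows that the Sinkhorn divergence yields an analogous cross-minus-self structure, with each term defined by two-sided entropic optimal-transport couplings instead of one-sided normalized Gibbs kernels.
  • Resolves an identifiability gap in prior drifting formulations: zero drift implies that the model and target measures match.
  • Improves one-step generative quality and reduces sensitivity to kernel temperature, at the cost of additional training time.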

Merits

Strength in Theoretical Foundations

The article gives a precise theoretical account of how drifting generative dynamics relate to the Sinkhorn divergence, and it uses the definiteness of that divergence to close the identifiability gap in prior formulations.

Empirical Improvements

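Experiments show that Sinkhorn drifting improves generative quality and reduces sensitivity to kernel temperature. On FFHQ-ALAE at the lowest temperature evaluated, mean FID falls from 187.7 to 37.1 and mean latent EMD from 453.3 to 144.4; on MNIST, the method preserves full class coverage across the temperature sweep.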

Demerits

Limited Scope

The article focuses primarily on the theoretical link between drifting generative dynamics and Sinkhorn divergence, with limited exploration of potential applications and extensions to other generative models.

Complexity of Methodology

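The abstract reports that Sinkhorn drifting trades additional training time for more stable optimization. This added cost, together with the optimal-transport background the method assumes, may limit its adoption in practice. The inference procedure itself is unchanged.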

Expert Commentary

This article contributes to generative modeling by establishing a theoretical link between drifting generative dynamics and the Sinkhorn divergence. The link clarifies why drifting behaves as it does and suggests ways to improve its stability and convergence. Although the emphasis on theory may limit the work's immediate practical impact, the experimental results indicate meaningful gains in generative quality and stability, particularly at low kernel temperatures. Future work can build on these foundations by extending the analysis to other generative models and applications.

Recommendations

  • Future research should focus on exploring the applications of Sinkhorn drifting to other generative models and domains, such as time-series and graph generation.
  • Developing more efficient and computationally feasible algorithms for Sinkhorn drifting will be crucial for its adoption in practical applications.
