
Transport Clustering: Solving Low-Rank Optimal Transport via Clustering

Henri Schmidt, Peter Halmos, Ben Raphael

arXiv:2603.03578v1. Abstract: Optimal transport (OT) finds a least cost transport plan between two probability distributions using a cost matrix defined on pairs of points. Unlike standard OT, which infers unstructured pointwise mappings, low-rank optimal transport explicitly constrains the rank of the transport plan to infer latent structure. This improves statistical stability and robustness, yields sharper parametric rates for estimating Wasserstein distances adaptive to the intrinsic rank, and generalizes $K$-means to co-clustering. These advantages, however, come at the cost of a non-convex and NP-hard optimization problem. We introduce transport clustering, an algorithm to compute a low-rank OT plan that reduces low-rank OT to a clustering problem on correspondences obtained from a full-rank $\textit{transport registration}$ step. We prove that this reduction yields polynomial-time, constant-factor approximation algorithms for low-rank OT: specifically, a $(1+\gamma)$ approximation for negative-type metrics and a $(1+\gamma+\sqrt{2\gamma}\,)$ approximation for kernel costs, where $\gamma \in [0,1]$ denotes the approximation ratio of the optimal full-rank solution relative to the low-rank optimal. Empirically, transport clustering outperforms existing low-rank OT solvers on synthetic benchmarks and large-scale, high-dimensional datasets.

Executive Summary

This article introduces transport clustering, an algorithm for computing low-rank optimal transport (OT) plans. It reduces low-rank OT to a clustering problem on correspondences produced by a full-rank transport registration step. The authors prove that this reduction yields polynomial-time, constant-factor approximation algorithms: a $(1+\gamma)$ approximation for negative-type metrics and a $(1+\gamma+\sqrt{2\gamma}\,)$ approximation for kernel costs, with $\gamma \in [0,1]$ as defined in the abstract. In the authors' experiments, transport clustering outperforms existing low-rank OT solvers on synthetic benchmarks and on large-scale, high-dimensional datasets. The method may be of interest wherever OT is applied to large, high-dimensional data, for example in machine learning, computer vision, and data analysis.
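To make the guarantees quoted in the abstract concrete, the short sketch below evaluates the two factors, $1+\gamma$ and $1+\gamma+\sqrt{2\gamma}$, at a few values of $\gamma$. The function name `approximation_factors` is a hypothetical helper introduced only for this illustration; the formulas themselves are taken from the abstract.

```python
import math


def approximation_factors(gamma: float) -> tuple[float, float]:
    """Return (negative-type factor, kernel-cost factor) for gamma in [0, 1].

    Hypothetical helper: it only evaluates the two bounds stated in the abstract.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    negative_type = 1.0 + gamma                        # (1 + gamma)
    kernel = 1.0 + gamma + math.sqrt(2.0 * gamma)      # (1 + gamma + sqrt(2 * gamma))
    return negative_type, kernel


for gamma in (0.0, 0.5, 1.0):
    neg, ker = approximation_factors(gamma)
    print(f"gamma={gamma:.1f}  negative-type: {neg:.3f}  kernel: {ker:.3f}")
# gamma=0.0  negative-type: 1.000  kernel: 1.000
# gamma=0.5  negative-type: 1.500  kernel: 2.500
# gamma=1.0  negative-type: 2.000  kernel: 3.414
```

Since $\gamma \in [0,1]$, the worst case is a factor of 2 for negative-type metrics and $2+\sqrt{2} \approx 3.414$ for kernel costs; both factors equal 1 when $\gamma = 0$.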

Key Points

  • Transport clustering reduces low-rank OT to a clustering problem on correspondences obtained from a full-rank transport registration step.
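  • The reduction yields polynomial-time, constant-factor approximation algorithms for low-rank OT: $(1+\gamma)$ for negative-type metrics and $(1+\gamma+\sqrt{2\gamma}\,)$ for kernel costs.
  • In the authors' experiments, transport clustering outperforms existing low-rank OT solvers on synthetic benchmarks and on large-scale, high-dimensional datasets.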
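The two-step reduction described above can be sketched as follows. This is one possible reading under stated assumptions, not the authors' implementation: both point clouds have the same size with uniform weights (so a full-rank plan can be taken to be a permutation), the cost is squared Euclidean, registration uses the Hungarian method from `scipy.optimize.linear_sum_assignment`, and the clustering step is plain Lloyd's k-means on concatenated matched pairs. The abstract does not specify the clustering objective, so that choice, along with the function name `transport_clustering_sketch`, is an assumption made for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def transport_clustering_sketch(X, Y, K, n_iter=50, seed=0):
    """Illustrative pipeline only; NOT the authors' reference implementation.

    Assumes X and Y each hold n points with uniform weights and K <= n.
    """
    n = X.shape[0]

    # Step 1 (registration): squared-Euclidean cost, optimal one-to-one matching.
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    _, sigma = linear_sum_assignment(C)  # x_i is matched to y_{sigma[i]}

    # Step 2 (clustering): Lloyd's k-means on matched pairs (assumed choice).
    Z = np.hstack([X, Y[sigma]])
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(n, size=K, replace=False)]
    for _ in range(n_iter):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = Z[labels == k].mean(axis=0)

    # Assemble a coupling of rank at most K: P = Q diag(1/g) R^T.
    Q = np.zeros((n, K))
    R = np.zeros((n, K))
    Q[np.arange(n), labels] = 1.0 / n
    R[sigma, labels] = 1.0 / n
    g = Q.sum(axis=0)          # mass of each co-cluster
    keep = g > 0               # drop clusters that ended up empty
    P = (Q[:, keep] / g[keep]) @ R[:, keep].T
    return P, labels, sigma


# Usage on synthetic data: Y is a noisy copy of X.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
Y = X + 0.1 * rng.normal(size=(60, 2))
P, labels, sigma = transport_clustering_sketch(X, Y, K=3)
print(np.linalg.matrix_rank(P), P.sum())
```

Each matched pair receives a single cluster label, so source points and their registered targets fall into the same co-cluster, and the resulting plan keeps uniform marginals while having rank at most K.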

Merits

Improves Statistical Stability and Robustness

By constraining the rank of the transport plan, low-rank OT improves statistical stability and robustness relative to standard, full-rank OT. Transport clustering computes such a low-rank plan.
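As a small illustration of what a rank constraint on a transport plan means, the sketch below compares a full-rank permutation coupling with a coupling factored as $Q\,\mathrm{diag}(1/g)\,R^\top$, a parameterization commonly used in the low-rank OT literature. The cluster labels `src` and `tgt` are hypothetical and chosen only so that every cluster has equal mass; nothing here is taken from the paper's experiments.

```python
import numpy as np

n, K = 12, 3
rng = np.random.default_rng(0)

# Full-rank plan: a permutation coupling between n uniformly weighted points.
perm = rng.permutation(n)
P_full = np.zeros((n, n))
P_full[np.arange(n), perm] = 1.0 / n

# Low-rank plan: hypothetical, balanced cluster labels for sources and targets.
src = np.arange(n) % K
tgt = rng.permutation(np.arange(n) % K)
Q = np.eye(K)[src] / n        # n x K, source-to-cluster mass
R = np.eye(K)[tgt] / n        # n x K, target-to-cluster mass
g = Q.sum(axis=0)             # cluster masses (equal to R.sum(axis=0) here)
P_low = Q @ np.diag(1.0 / g) @ R.T

print(np.linalg.matrix_rank(P_full), np.linalg.matrix_rank(P_low))
print(np.allclose(P_low.sum(axis=1), 1.0 / n), np.allclose(P_low.sum(axis=0), 1.0 / n))
```

Both plans have the same uniform marginals, but the permutation coupling has rank 12 while the factored coupling has rank 3: mass moves between clusters rather than between individual points.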

Polynomial-Time Complexity

Reducing low-rank OT to a clustering problem yields polynomial-time algorithms with constant-factor approximation guarantees, even though the underlying problem is NP-hard.

Demerits

Non-Convex Optimization Problem

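Low-rank OT itself remains non-convex and NP-hard, so transport clustering returns an approximate rather than an exact solution. The stated guarantees cover negative-type metrics and kernel costs and depend on $\gamma$; the abstract does not state guarantees for other cost functions.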

Expert Commentary

The contribution is significant: it gives polynomial-time approximation guarantees for a problem that is non-convex and NP-hard. Transport clustering achieves this by reducing low-rank OT to a clustering problem, which lets it draw on existing clustering algorithms. Further research is needed to characterize the algorithm's limitations and its applications in specific fields. The empirical results are encouraging, but evaluations on more diverse datasets would strengthen the authors' claims.

Recommendations

  • Future research should characterize the algorithm's limitations and its applications in specific fields.
  • Evaluations on more diverse datasets would strengthen the authors' claims and give a more nuanced picture of the algorithm's performance.

Sources