
Stability and Generalization of Push-Sum Based Decentralized Optimization over Directed Graphs

Yifei Liang, Yan Sun, Xiaochun Cao, Li Shen

Abstract (arXiv:2602.20567v1): Push-Sum-based decentralized learning enables optimization over directed communication networks, where information exchange may be asymmetric. While convergence properties of such methods are well understood, their finite-iteration stability and generalization behavior remain unclear due to structural bias induced by column-stochastic mixing and asymmetric error propagation. In this work, we develop a unified uniform-stability framework for the Stochastic Gradient Push (SGP) algorithm that captures the effect of directed topology. A key technical ingredient is an imbalance-aware consistency bound for Push-Sum, which controls consensus deviation through two quantities: the stationary distribution imbalance parameter $\delta$ and the spectral gap $(1-\lambda)$ governing mixing speed. This decomposition enables us to disentangle statistical effects from topology-induced bias. We establish finite-iteration stability and optimization guarantees for both convex objectives and non-convex objectives satisfying the Polyak–Łojasiewicz condition. For convex problems, SGP attains excess generalization error of order $\tilde{\mathcal{O}}\!\left(\frac{1}{\sqrt{mn}}+\frac{\gamma}{\delta(1-\lambda)}+\gamma\right)$ under step-size schedules, and we characterize the corresponding optimal early stopping time that minimizes this bound. For PŁ objectives, we obtain convex-like optimization and generalization rates with dominant dependence proportional to $\kappa\!\left(1+\frac{1}{\delta(1-\lambda)}\right)$, revealing a multiplicative coupling between problem conditioning and directed communication topology. Our analysis clarifies when Push-Sum correction is necessary compared with standard decentralized SGD and quantifies how imbalance and mixing jointly shape the best attainable learning performance.
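
For concreteness, the sketch below shows one iteration of the push-sum SGD update that SGP performs, under assumed conventions not spelled out in the abstract: `A` is a fixed column-stochastic mixing matrix, each node keeps a push-sum numerator $x_i$ and weight $w_i$, and gradients are evaluated at the de-biased iterate $z_i = x_i / w_i$. The names `sgp_step`, `grad`, and `lr` are illustrative, not the paper's notation.

```python
import numpy as np

def sgp_step(A, X, w, grad, lr, t):
    """One Stochastic Gradient Push iteration for all n nodes at once.

    A    : (n, n) column-stochastic mixing matrix (columns sum to 1)
    X    : (n, d) push-sum numerators, row i belongs to node i
    w    : (n,)  push-sum weights, initialized to all ones at t = 0
    grad : grad(i, z, t) -> node i's stochastic gradient at point z
    """
    # Each node de-biases its iterate and takes a local gradient step.
    Z = X / w[:, None]                       # de-biased iterates z_i = x_i / w_i
    G = np.stack([grad(i, Z[i], t) for i in range(len(w))])
    # Push-sum mixing: numerators and weights are mixed with the same
    # column-stochastic matrix, so the ratio X / w tracks the network average.
    X_new = A @ (X - lr * G)
    w_new = A @ w
    return X_new, w_new, X_new / w_new[:, None]
```

Because $A$ is only column-stochastic (not doubly stochastic), plain decentralized SGD would drift toward a skewed weighting of nodes; the division by $w$ is the Push-Sum correction that removes this bias, at the cost of the asymmetric error propagation the paper analyzes.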

Executive Summary

The paper analyzes the stability and generalization of Push-Sum-based decentralized optimization over directed graphs, giving a unified uniform-stability framework that isolates how directed topology affects learning. The authors establish finite-iteration stability and optimization guarantees for both convex objectives and non-convex objectives satisfying the Polyak–Łojasiewicz (PŁ) condition, and they characterize the optimal early stopping time for convex problems. The results reveal a multiplicative coupling between problem conditioning and directed communication topology, clarifying when the Push-Sum correction is necessary and how stationary-distribution imbalance ($\delta$) and mixing speed ($1-\lambda$) jointly shape attainable performance.
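
Since the bounds hinge on $\delta$ and $(1-\lambda)$, it helps to see how one would compute them for a concrete mixing matrix. The sketch below assumes common definitions that the abstract does not spell out: $\pi$ is the Perron vector of the column-stochastic matrix $A$, $\delta = n \min_i \pi_i$ (so $\delta = 1$ for a balanced, doubly stochastic matrix), and $\lambda$ is the modulus of the second-largest eigenvalue. The function name `topology_quantities` is illustrative.

```python
import numpy as np

def topology_quantities(A):
    """Perron vector pi, imbalance delta = n * min(pi), spectral gap 1 - lam."""
    n = A.shape[0]
    vals, vecs = np.linalg.eig(A)
    order = np.argsort(-np.abs(vals))   # eigenvalue indices sorted by modulus
    pi = np.real(vecs[:, order[0]])     # Perron vector for the eigenvalue 1
    pi = pi / pi.sum()                  # normalize to a probability vector
    delta = n * pi.min()                # assumed imbalance definition
    lam = np.abs(vals[order[1]])        # second-largest eigenvalue modulus
    return pi, delta, 1.0 - lam

# Example: directed ring where each node keeps half its mass and pushes half
# to its successor; columns of A sum to 1 (column-stochastic).
n = 5
A = 0.5 * np.eye(n) + 0.5 * np.roll(np.eye(n), 1, axis=0)
pi, delta, gap = topology_quantities(A)
print(pi, delta, gap)   # uniform pi, delta = 1.0 for this balanced example
```

The directed ring here happens to be doubly stochastic, so $\pi$ is uniform and $\delta = 1$; a digraph with lopsided in-degrees would give $\delta < 1$ and, per the bound above, a larger $\gamma/(\delta(1-\lambda))$ stability term.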

Key Points

  • Development of a unified uniform-stability framework for the Stochastic Gradient Push (SGP) algorithm over directed graphs
  • Finite-iteration stability and optimization guarantees for convex objectives and for non-convex objectives satisfying the Polyak–Łojasiewicz condition
  • Characterization of the optimal early stopping time that minimizes the convex generalization bound (see the schematic trade-off after this list)
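
The early-stopping result can be read through the standard uniform-stability lens. The display below is a generic schematic of that argument, not the paper's exact statement; $\varepsilon_{\mathrm{opt}}$, $\varepsilon_{\mathrm{stab}}$, and $T^\star$ are illustrative names.

```latex
% Schematic excess-risk decomposition behind early stopping (generic
% uniform-stability argument; illustrative, not the paper's statement).
\[
  \underbrace{\mathbb{E}\,F(\bar{x}_T) - F^\star}_{\text{excess risk}}
  \;\lesssim\;
  \underbrace{\varepsilon_{\mathrm{opt}}(T)}_{\text{decreases in }T}
  \;+\;
  \underbrace{\varepsilon_{\mathrm{stab}}(T)}_{\text{grows in }T},
  \qquad
  T^\star \in \arg\min_{T}\bigl[\varepsilon_{\mathrm{opt}}(T)
    + \varepsilon_{\mathrm{stab}}(T)\bigr].
\]
```

In the paper's convex bound the stability side carries the factor $\frac{\gamma}{\delta(1-\lambda)}$, so the directed topology enters the trade-off, and hence the optimal stopping time, through $\delta$ and $1-\lambda$.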

Merits

Theoretical Rigor

The paper gives a rigorous and comprehensive analysis of Push-Sum-based decentralized optimization. The imbalance-aware consistency bound cleanly separates statistical effects from topology-induced bias, yielding a unified treatment of convex and PŁ objectives under one stability framework.

Demerits

Limited Practical Application

The paper is primarily theoretical, with limited discussion of practical applications and implementation details.

Expert Commentary

The paper makes a significant contribution to decentralized optimization theory. The unified uniform-stability framework yields a nuanced picture of how directed topology shapes both optimization and generalization: the imbalance parameter $\delta$ and the spectral gap $1-\lambda$ enter the bounds jointly, and for PŁ objectives they couple multiplicatively with the problem's conditioning. These results matter for algorithm design over asymmetric (directed) communication networks, since they quantify when the Push-Sum correction is actually needed relative to standard decentralized SGD. However, further work is needed to explore the practical applications and implementation details of the approach.

Recommendations

  • Future work should explore practical applications and implementation details of Push-Sum-based decentralized optimization, including empirical validation of the predicted dependence on $\delta$ and $1-\lambda$.
  • Developing more efficient decentralized optimization algorithms for asymmetric (directed) communication networks remains an important direction.

Sources

  • arXiv:2602.20567v1 — Yifei Liang, Yan Sun, Xiaochun Cao, Li Shen, "Stability and Generalization of Push-Sum Based Decentralized Optimization over Directed Graphs."