Stochastic Gradient Descent in the Saddle-to-Saddle Regime of Deep Linear Networks
arXiv:2604.06366v1. Abstract: Deep linear networks (DLNs) are used as an analytically tractable model of the training dynamics of deep neural networks. While …
Guillaume Corlouer, Avi Semler, Alexander Strang, Alexander Gietelink Oldenziel