Sink-Aware Pruning for Diffusion Language Models
arXiv:2602.17664v1
Abstract: Diffusion Language Models (DLMs) incur high inference cost due to iterative denoising, motivating efficient pruning. Existing pruning heuristics largely inherited …
Aidar Myrzakhan, Tianyi Li, Bowei Guo, Shengkun Tang, Zhiqiang Shen