All Articles

Articles

Academic · 1 min

Does a Global Perspective Help Prune Sparse MoEs Elegantly?

arXiv:2604.06542v1 Announce Type: new Abstract: Empirical scaling laws for language models have encouraged the development of ever-larger LLMs, despite their growing computational and memory costs. …

Zeliang Zhang, Nikhil Ghosh, Jiani Liu, Bin Yu, Xiaodong Liu
7 views
Academic · 1 min

The Rhetoric of Machine Learning

arXiv:2604.06754v1 Announce Type: new Abstract: I examine the technology of machine learning from the perspective of rhetoric, which is simply the art of persuasion. Rather …

Robert C. Williamson
50 views
Academic · 1 min

DiffuMask: Diffusion Language Model for Token-level Prompt Pruning

arXiv:2604.06627v1 Announce Type: new Abstract: In-Context Learning and Chain-of-Thought prompting improve reasoning in large language models (LLMs). These typically come at the cost of longer, …

Caleb Zheng, Jyotika Singh, Fang Tu, Weiyi Sun, Sujeeth Bharadwaj, Yassine Benajiba, Sujith Ravi, Eli Shlizerman, Dan Roth
19 views