OptiML: An End-to-End Framework for Program Synthesis and CUDA Kernel Optimization
arXiv:2602.12305v1 Announce Type: cross Abstract: Generating high-performance CUDA kernels remains challenging due to the need to navigate a combinatorial space of low-level transformations under noisy …
Arijit Bhattacharjee, Heng Ping, Son Vu Le, Paul Bogdan, Nesreen K. Ahmed, Ali Jannesari
49 views