Pyramid MoA: A Probabilistic Framework for Cost-Optimized Anytime Inference
arXiv:2602.19509v1
Abstract: Large Language Models (LLMs) face a persistent trade-off between inference cost and reasoning capability. While "Oracle" models (e.g., Llama-3-70B) achieve …
Arindam Khaled