Articles

Academic · 1 min

LLM Routing as Reasoning: A MaxSAT View

arXiv:2603.13612v1. Abstract: Routing a query through an appropriate LLM is challenging, particularly when user preferences are expressed in natural language and model …

Son Nguyen, Xinyuan Liu, Ransalu Senanayake
12 views
Academic · 1 min

The Phenomenology of Hallucinations

arXiv:2603.13911v1. Abstract: We show that language models hallucinate not because they fail to detect uncertainty, but because of a failure to integrate …

Valeria Ruscio, Keiran Thompson
5 views
Academic · 1 min

Widespread Gender and Pronoun Bias in Moral Judgments Across LLMs

arXiv:2603.13636v1. Abstract: Large language models (LLMs) are increasingly used to assess moral or ethical statements, yet their judgments may reflect social and …

Gustavo Lúcius Fernandes, Jeiverson C. V. M. Santos, Pedro O. S. Vaz-de-Melo
22 views
Academic · 1 min

QuarkMedBench: A Real-World Scenario Driven Benchmark for Evaluating Large Language Models

arXiv:2603.13691v1. Abstract: While Large Language Models (LLMs) excel on standardized medical exams, high scores often fail to translate to high-quality responses for …

Yao Wu, Kangping Yin, Liang Dong, Zhenxin Ma, Shuting Xu, Xuehai Wang, Yuxuan Jiang, Tingting Yu, Yunqing Hong, Jiayi Liu, Rianzhe Huang, Shuxin Zhao, Haiping Hu, Wen Shang, Jian Xu, Guanjun Jiang
20 views