EvalSense: A Framework for Domain-Specific LLM (Meta-)Evaluation
arXiv:2602.18823v1
Abstract: Robust and comprehensive evaluation of large language models (LLMs) is essential for identifying effective LLM system configurations and mitigating risks …
Adam Dejl, Jonathan Pearson