LFQA-HP-1M: A Large-Scale Human Preference Dataset for Long-Form Question Answering
arXiv:2602.23603v1. Abstract: Long-form question answering (LFQA) demands nuanced evaluation of multi-sentence explanatory responses, yet existing metrics often fail to reflect human judgment. …
Rafid Ishrak Jahan, Fahmid Shahriar Iqbal, Sagnik Ray Choudhury