On Representation Redundancy in Large-Scale Instruction Tuning Data Selection
arXiv:2602.13773v1 Announce Type: new Abstract: Data quality is a crucial factor in large language model training. While prior work has shown that models trained on …
Youwei Shu, Shaomian Zheng, Dingnan Jin, Wenjie Qu, Ziyao Guo, Qing Cui, Jun Zhou, Jiaheng Zhang