Synthetic Mixed Training: Scaling Parametric Knowledge Acquisition Beyond RAG
arXiv:2603.23562v1 Announce Type: new Abstract: Synthetic data augmentation helps language models learn new knowledge in data-constrained domains. However, naively scaling existing synthetic data methods by …
Seungju Han, Konwoo Kim, Chanwoo Park, Benjamin Newman, Suhas Kotha, Jaehun Jung, James Zou, Yejin Choi
7 views