Rooted Absorbed Prefix Trajectory Balance with Submodular Replay for GFlowNet Training
arXiv:2603.00454v1
Abstract: Generative Flow Networks (GFlowNets) enable fine-tuning large language models to approximate reward-proportional posteriors, but they remain prone to mode collapse, …
Xi Wang, Wenbo Lu, Shengjie Wang