LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization
arXiv:2603.02680v1 Announce Type: new Abstract: While Large Language Models (LLMs) form the cornerstone of sequential decision-making agent development, they have inherent limitations in high-frequency decision …
Yang Zhao, Zihao Li, Zhiyu Jiang, Dandan Ma, Ganchao Liu, Wenzhe Zhao
3 views