One-Step Flow Policy: Self-Distillation for Fast Visuomotor Policies

arXiv:2603.12480v1. Abstract: Generative flow and diffusion models provide the continuous, multimodal action distributions needed for high-precision robotic policies. However, their reliance on iterative sampling introduces severe inference latency, degrading control frequency and harming performance in time-sensitive manipulation. To address this problem, we propose the One-Step Flow Policy (OFP), a from-scratch self-distillation framework for high-fidelity, single-step action generation without a pre-trained teacher. OFP unifies a self-consistency loss to enforce coherent transport across time intervals, and a self-guided regularization to sharpen predictions toward high-density expert modes. In addition, a warm-start mechanism leverages temporal action correlations to minimize the generative transport distance. Evaluations across 56 diverse simulated manipulation tasks demonstrate that a one-step OFP achieves state-of-the-art results, outperforming 100-step diff

Shaolong Li, Lichao Sun, Yongchao Chen · March 16, 2026

#cs.RO #cs.AI

Executive Summary

The article proposes the One-Step Flow Policy (OFP), a self-distillation framework that trains from scratch, without a pre-trained teacher, to generate high-fidelity robot actions in a single sampling step. OFP combines a self-consistency loss, a self-guided regularization, and a warm-start mechanism. Across 56 diverse simulated manipulation tasks, one-step OFP is reported to achieve state-of-the-art results, outperforming 100-step diffusion and flow policies while accelerating action generation by over 100 times. The authors also demonstrate scalability by integrating OFP into the π0.5 model on RoboTwin 2.0, where one-step OFP surpasses the original 10-step policy. They position OFP as a practical, scalable solution for accurate, low-latency robot control.
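To make the "one step versus 100 steps" comparison concrete, here is a minimal sketch of Euler sampling along a toy flow. It is not the OFP network: `velocity`, `sample`, and the two-dimensional "action" are hypothetical stand-ins, and the velocity field is a hand-written straight-line flow rather than a learned, observation-conditioned model. It only illustrates that sampling cost grows with the number of velocity evaluations, and that a sufficiently straight transport can be covered in a single step.

```python
def velocity(x, t, target):
    """Toy velocity field (assumption: stands in for a learned network).

    For the straight path x_t = (1 - t) * x_0 + t * target, the velocity
    that carries x_t to `target` at t = 1 is (target - x_t) / (1 - t).
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample(x0, target, num_steps):
    """Euler-integrate from t = 0 to t = 1; return (action, evaluation count)."""
    x, evals = list(x0), 0
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity(x, t, target)  # one "network" evaluation per step
        evals += 1
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x, evals


noise = [0.3, -1.2]          # starting sample (hypothetical)
expert_action = [0.5, 0.1]   # mode the flow transports to (hypothetical)

a100, n100 = sample(noise, expert_action, 100)
a1, n1 = sample(noise, expert_action, 1)
print(n100, n1)  # 100 1
```

In this toy case both samplers land on the same action, but the 100-step sampler pays for 100 evaluations. A learned velocity field is generally curved, which is why iterative samplers need many steps and why training objectives that make a single large step accurate are of interest.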

Key Points

▸ The One-Step Flow Policy (OFP) is a novel self-distillation framework for high-fidelity, single-step action generation in robotic policies.
▸ OFP leverages a self-consistency loss, self-guided regularization, and a warm-start mechanism to achieve state-of-the-art results.
▸ One-step OFP is evaluated on 56 diverse simulated manipulation tasks; to demonstrate scalability, the authors integrate it into the π0.5 model on RoboTwin 2.0, where it surpasses the original 10-step policy.
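The summary does not give the form of the self-consistency loss, so the following is only a generic sketch of the idea of "coherent transport across time intervals": one jump over an interval of length 2d should agree with two consecutive jumps of length d. The functions `self_consistency_loss`, `exact_step`, and `euler_step` are hypothetical names, the state is a scalar, and the dynamics dx/dt = -x are chosen purely for illustration. This should not be read as the paper's actual objective.

```python
import math


def self_consistency_loss(step_fn, x, t, d):
    """Squared gap between one jump of size 2d and two jumps of size d."""
    big_jump = step_fn(x, t, 2 * d)
    mid = step_fn(x, t, d)
    two_jumps = step_fn(mid, t + d, d)  # would be a fixed target during training
    return (big_jump - two_jumps) ** 2


def exact_step(x, t, d):
    """Exact flow map of dx/dt = -x: composes consistently across intervals."""
    return x * math.exp(-d)


def euler_step(x, t, d):
    """One Euler step of dx/dt = -x: inconsistent for large d."""
    return x * (1.0 - d)


loss_exact = self_consistency_loss(exact_step, 1.0, 0.0, 0.25)
loss_euler = self_consistency_loss(euler_step, 1.0, 0.0, 0.25)
print(loss_exact, loss_euler)
```

The exact flow map drives the loss to (numerically) zero, while the crude Euler step leaves a nonzero gap. Minimizing such a gap pushes a model's large steps to match what its own smaller steps would produce, which is one common way a model can act as its own teacher.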

Merits

Strength

The proposed method achieves state-of-the-art results in simulated manipulation tasks, outperforming 100-step diffusion and flow policies while accelerating action generation by over 100 times.
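As a rough back-of-the-envelope illustration of why step count matters for control frequency, the sketch below assumes that each network evaluation costs a fixed 5 ms and that nothing else contributes to latency. Both the 5 ms figure and the helper `control_rate_hz` are assumptions for illustration, not measurements from the paper.

```python
def control_rate_hz(num_steps, ms_per_eval):
    """Upper bound on control frequency if sampling is the only cost."""
    latency_ms = num_steps * ms_per_eval
    return 1000.0 / latency_ms


ms_per_eval = 5.0  # hypothetical cost of one network evaluation

rate_100_step = control_rate_hz(100, ms_per_eval)  # 500 ms per action
rate_one_step = control_rate_hz(1, ms_per_eval)    # 5 ms per action
print(rate_100_step, rate_one_step)  # 2.0 200.0
```

Under these assumptions a 100-step sampler could issue at most 2 actions per second, against 200 for a one-step sampler. Real systems also spend time on observation encoding and other fixed overheads, so measured speedups depend on the full pipeline.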

Demerits

Limitation

The article primarily focuses on simulated manipulation tasks and lacks real-world experiments to validate the scalability and robustness of the proposed method.

Expert Commentary

The article makes a notable contribution to robotic control: a self-distillation framework that achieves state-of-the-art results in simulated manipulation tasks with a single sampling step. While the authors demonstrate the scalability of the method in simulation, real-world experiments are still needed to validate its scalability and robustness in diverse scenarios. If the results carry over to physical robots, the method could enable fast, accurate, low-latency action generation in real-world settings. Further research is needed to characterize the method's limitations.

Recommendations

✓ Future research should focus on conducting real-world experiments to validate the scalability and robustness of the proposed method.
✓ The authors should explore the application of the proposed method in various domains, such as healthcare, manufacturing, and logistics, to demonstrate its practical implications.

Sources

arXiv - cs.AI

One-Step Flow Policy: Self-Distillation for Fast Visuomotor Policies
