
Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning


Huihan Liu, Changyeon Kim, Bo Liu, Minghuan Liu, Yuke Zhu

arXiv:2603.03818v1 — Abstract: Continual learning is a long-standing challenge in robot policy learning, where a policy must acquire new skills over time without catastrophically forgetting previously learned ones. While prior work has extensively studied continual learning in relatively small behavior cloning (BC) policy models trained from scratch, its behavior in modern large-scale pretrained Vision-Language-Action (VLA) models remains underexplored. In this work, we found that pretrained VLAs are remarkably resistant to forgetting compared with smaller policy models trained from scratch. Simple Experience Replay (ER) works surprisingly well on VLAs, sometimes achieving zero forgetting even with a small replay data size. Our analysis reveals that pretraining plays a critical role in downstream continual learning performance: large pretrained models mitigate forgetting with a small replay buffer size while maintaining strong forward learning capabilities. Furthermore, we found that VLAs can retain relevant knowledge from prior tasks despite performance degradation during learning new tasks. This knowledge retention enables rapid recovery of seemingly forgotten skills through finetuning. Together, these insights imply that large-scale pretraining fundamentally changes the dynamics of continual learning, enabling models to continually acquire new skills over time with simple replay. Code and more information can be found at https://ut-austin-rpl.github.io/continual-vla

Executive Summary

This article examines how resistant pretrained Vision-Language-Action (VLA) models are to forgetting in continual learning. The study reveals that large-scale pretrained VLA models exhibit remarkable resistance to forgetting compared to smaller policy models trained from scratch. The authors found that simple Experience Replay (ER) works surprisingly well on VLA models, sometimes achieving zero forgetting even with a small replay data size. The research highlights the critical role of pretraining in downstream continual learning performance and demonstrates that VLA models retain relevant knowledge from prior tasks even when their performance degrades while learning new tasks.

Key Points

  • Pretrained VLA models are resistant to forgetting in continual learning
  • Simple Experience Replay (ER) works well on VLA models with a small replay data size
  • Pretraining plays a critical role in downstream continual learning performance

Merits

Improved Continual Learning

The study demonstrates that pretrained VLA models can continually acquire new skills over time with simple replay, mitigating the issue of catastrophic forgetting.
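The Experience Replay recipe the article refers to can be sketched as follows. This is a minimal illustration under common assumptions, not the authors' implementation: a small reservoir-sampled buffer retains demonstrations from earlier tasks, and each finetuning batch on a new task is mixed with a few replayed samples. The `ReplayBuffer` class and `mixed_batch` helper are hypothetical names.

```python
import random


class ReplayBuffer:
    """Fixed-capacity buffer of demonstrations from previously learned tasks."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.samples = []
        self.seen = 0  # total samples offered, for reservoir sampling

    def add(self, sample):
        # Reservoir sampling keeps a uniform random subset of all past data,
        # so the buffer stays small even as tasks accumulate.
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            idx = random.randrange(self.seen)
            if idx < self.capacity:
                self.samples[idx] = sample

    def draw(self, k):
        # Sample up to k stored demonstrations without replacement.
        k = min(k, len(self.samples))
        return random.sample(self.samples, k)


def mixed_batch(new_task_batch, buffer, replay_ratio=0.5):
    """Augment a current-task batch with replayed old-task samples."""
    n_replay = int(len(new_task_batch) * replay_ratio)
    return list(new_task_batch) + buffer.draw(n_replay)
```

During continual finetuning, the training loop would call `buffer.add` on each demonstration from completed tasks and train on `mixed_batch(...)` instead of the raw new-task batch; the article's point is that for large pretrained VLAs, even a small `capacity` suffices to keep forgetting near zero.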

Demerits

Limited Exploration of Other Methods

The study primarily focuses on simple Experience Replay (ER) and does not extensively explore other methods for continual learning in VLA models.

Expert Commentary

The study's results are notable: they demonstrate that large-scale pretrained VLA models can exhibit a high degree of resistance to forgetting in continual learning, with significant implications for AI systems that must learn and adapt over time. However, further research is needed to fully explore the potential of VLA models in continual learning and to address the limitations of the current study. The findings also underscore the importance of pretraining for downstream continual learning performance, which can inform more effective training strategies for VLA models.

Recommendations

  • Further research should be conducted to explore the use of other methods for continual learning in VLA models, such as regularization techniques and online learning approaches.
  • The development of more comprehensive evaluation metrics and benchmarks for continual learning in VLA models is necessary to fully assess their performance and potential applications.
