Foundation models have made great advances in robotics, enabling the creation of vision-language-action (VLA) models that generalize to objects, scenes, and tasks beyond their training data. However, ...