Rho-Alpha

Robotics VLA+ Model from Phi

Try Rho-Alpha on Microsoft Foundry →

About Rho-Alpha

Rho-Alpha (ρα) is the first robotics model derived from Microsoft’s Phi series of vision-language models. As a vision-language-action (VLA) model, it translates natural-language commands into control signals for bimanual robotic manipulation, integrating tactile sensing alongside vision and inheriting the efficiency and grounding characteristics of the Phi vision-language backbone. The model also learns continually from human teleoperation feedback during deployment, allowing the robot to adapt to new environments and tasks without an explicit retraining cycle.

Rho-Alpha addresses the central pain point of physical robotics: task-specific programming dominates development cost, and policies trained in simulation often degrade in the real world. By combining language understanding with multimodal sensing (vision, proprioception, touch) and continuous teleoperation-driven learning, the model gives deployed systems a path to improve over time rather than ship frozen. As Microsoft’s first Phi-derived robotics model, it signals a deliberate move from purely digital agents into Physical AI, where the same family of small, efficient models drives both screens and arms.

Key capabilities

VLA+ with tactile sensing and online learning from corrections
First robotics model derived from Microsoft's Phi VLM series
Translates natural language directly into bimanual control signals
Continual learning from human teleoperation during deployment
Early-access research model for physical-AI experimentation
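
The correction loop described above, where a human teleoperator overrides the robot during deployment and those overrides become training signal, can be sketched roughly as follows. This is a minimal illustration of intervention logging, not Rho-Alpha's interface: the page documents no API, so `Observation`, `CorrectionBuffer`, `control_step`, and the flat action tuple are all hypothetical names, and the step that actually updates the model from the logged corrections is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical action format: flattened joint targets for both arms.
Action = Tuple[float, ...]


@dataclass
class Observation:
    """Hypothetical multimodal input: language plus vision, proprioception, touch."""
    instruction: str
    image: Sequence[float]
    proprioception: Sequence[float]
    tactile: Sequence[float]


@dataclass
class CorrectionBuffer:
    """Stores (observation, operator action) pairs for a later policy update."""
    samples: List[Tuple[Observation, Action]] = field(default_factory=list)

    def add(self, obs: Observation, action: Action) -> None:
        self.samples.append((obs, action))


def control_step(
    policy: Callable[[Observation], Action],
    obs: Observation,
    operator_action: Optional[Action],
    buffer: CorrectionBuffer,
) -> Action:
    """Return the action to execute for one control tick.

    If the operator supplies an action, it overrides the policy's proposal
    and is logged as a correction; otherwise the policy's action is used.
    """
    proposed = policy(obs)
    if operator_action is None:
        return proposed
    buffer.add(obs, operator_action)
    return operator_action
```

In this sketch the operator's action always takes priority, and only the ticks where a human intervened are logged, so the buffer accumulates examples from exactly the situations where the policy needed help. How Rho-Alpha itself consumes such feedback is not specified in the source.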

Technology Stack

Phi VLM backbone
Robotics simulation

Ready to Explore?

Dive into platform integrations, source code, research papers, and announcements.

PLATFORM: Microsoft Foundry
Try Rho-Alpha in the Microsoft Foundry model catalog.

BLOG: Microsoft Blog
See the latest updates from Microsoft Research.