Azure AI Foundry Labs | Early-Stage AI Experiments & Prototypes

SocialReasoning-Bench

SocialReasoning-Bench is an open-source benchmark from Microsoft Research AI Frontiers that measures whether AI agents can negotiate competently and act in their user’s best interest in multi-party settings.

EXPLORE →

PRODUCTION

Microsoft Agent Framework

The open-source framework for developers and AI engineers building production agentic applications — the direct successor to Semantic Kernel and AutoGen.

EXPLORE →

Experiment

EO/OS Object Detection

A Microsoft first-party Earth Observation and Overhead Sensing model that identifies and localizes objects in satellite and aerial imagery.

EXPLORE →

PRODUCTION

Model Router

A trained language model in Microsoft Foundry that routes each prompt in real time to the most suitable underlying LLM.

EXPLORE →

PRODUCTION

MAI-Image-2-Efficient

Microsoft’s latest text-to-image model, engineered for builders who need high-quality image generation at speed and scale.

EXPLORE →

Experiment

harrier-oss-v1

Harrier-OSS-v1 is a family of open-source multilingual text embedding models from Microsoft that delivers state-of-the-art retrieval and semantic understanding across 94 languages, available in three sizes from 270M to 27B parameters.

EXPLORE →

PRODUCTION

MAI-Voice-1

MAI-Voice-1 is an advanced, next-generation neural text-to-speech (TTS) model.

EXPLORE →

PRODUCTION

MAI-Transcribe-1

A speech recognition model that supports up to 25 languages.

EXPLORE →

Experiment

MAI-Image-2

A state-of-the-art text-to-image generation model from Microsoft AI.

EXPLORE →

Experiment

VibeVoice ASR

A unified speech‑to‑text model designed to transcribe up to 60 minutes of continuous audio in a single pass.

EXPLORE →

Experiment

Phi-4-Reasoning-Vision-15B

A compact and smart open‑weight multimodal reasoning model that balances reasoning power, efficiency, and training data needs.

EXPLORE →

Experiment

BugPilot

Complex bug generation for efficient learning of software engineering skills.

EXPLORE →

Experiment

Rho-Alpha

Rho-alpha (ρα) is the first robotics model derived from Microsoft’s Phi series of vision-language models.

EXPLORE →

Experiment

OptiMind

OptiMind is a small language model designed to convert business problems described in natural language into the mathematical formulations needed by optimization software.

EXPLORE →

Experiment

GigaTIME

GigaTIME is a multimodal AI model for translating routinely available hematoxylin and eosin (H&E) pathology slides to virtual multiplex immunofluorescence (mIF) images.

EXPLORE →

Experiment

Fara-7B

Fara‑7B is an open-weight, ultra-compact agentic small language model (SLM) designed for computer use. It enables on-device automation of real-world web tasks by interacting directly with interfaces the way a human would.

EXPLORE →

Experiment

Promptions

Promptions dynamically generates UI to help users steer AI responses more effectively. It’s simple, effective, and easily customizable, making it suitable for developers from individual vibe-coders to enterprise software engineers.

EXPLORE →

Experiment

RosettaFold3

RosettaFold3 (RF3) is a unified biomolecular modeling system that predicts 3D structures of proteins, nucleic acids, and small molecules within a single framework.

EXPLORE →

Experiment

MMCT-Agent

An agent for complex visual reasoning on images and long-form videos.

EXPLORE →

Experiment

RetroChimera

RetroChimera is a model that takes a target molecule to be synthesized, encoded as a character sequence in SMILES notation, and proposes several chemical reactions that could produce it. Each reaction is represented as a group of ingredients (reactant molecules), and those molecules…
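
To make the input and output shapes described above concrete, here is a minimal Python sketch. It does not call RetroChimera and is not its API: the `Reaction` class and its method are hypothetical names, and the single candidate shown is a hand-written textbook example (aspirin from salicylic acid and acetic anhydride), not model output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """Hypothetical container for one candidate single-step reaction."""

    reactants: tuple[str, ...]  # each reactant is a SMILES string
    product: str                # the target molecule, also as SMILES

    def to_reaction_smiles(self) -> str:
        # Reaction SMILES convention: reactants joined by '.', then '>>', then the product.
        return ".".join(self.reactants) + ">>" + self.product


# Input: the target molecule as a SMILES string (aspirin).
target = "CC(=O)Oc1ccccc1C(=O)O"

# Output shape: a list of candidate reactions. This entry is written by hand
# for illustration; a real model would propose several candidates.
candidates = [
    Reaction(
        reactants=("O=C(O)c1ccccc1O", "CC(=O)OC(C)=O"),  # salicylic acid, acetic anhydride
        product=target,
    )
]

rxn_smiles = candidates[0].to_reaction_smiles()
print(rxn_smiles)
```

Keeping reactants as a tuple of separate SMILES strings, rather than one joined string, makes it easy to feed each reactant back in as a new target when exploring multi-step routes.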

EXPLORE →

Experiment

Data Formulator

Data Formulator blends natural language and visual interfaces to help analysts explore and visualize data with AI agents. Starting with any data (clean or messy, small or large), analysts can easily describe their goals to AI agents and collaboratively explore data in different directions in parallel to discover new insights.

EXPLORE →

Experiment

Magentic Marketplace

Magentic Marketplace is an open-source simulation environment for exploring the numerous possibilities of agentic markets and their societal implications at scale. It provides a foundation for studying these markets and guiding them toward outcomes that benefit everyone. 

EXPLORE →

Experiment

Skala

Skala is a deep-learning-based exchange-correlation functional that achieves experimental accuracy in density functional theory. Trained on the largest high-accuracy dataset of molecular energies, Skala advances computational chemistry with reliable, scalable predictions for molecules and materials.

EXPLORE →

Experiment

MatterGen

MatterGen is a diffusion-based generative model for inorganic materials design. It can propose stable, novel crystal structures and be guided by target properties like bulk modulus, band gap, or magnetic density, accelerating materials discovery.

EXPLORE →

Experiment

CalcLM

CalcLM is a prototype from Azure AI Foundry Labs that experiments with bringing agentic AI into the grid interface of Excel. It enables users to express agent steps as formulas, chain tasks across cells, and quickly explore how simple agent workflows might flow in a familiar surface.

EXPLORE →

Experiment

Debug-gym

Debug-gym is an open-source research environment for teaching AI coding agents to debug more like humans—interactively and iteratively. With tools like Python’s pdb, agents can set breakpoints, inspect code, and run tests, enabling smarter, more reliable coding workflows.
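
As a rough illustration of what driving pdb from a program (rather than a keyboard) can look like, the sketch below feeds a fixed list of debugger commands to Python's standard `pdb.Pdb` class and captures the output. It uses only the standard library and is not Debug-gym's own interface. The `buggy_mean` function and the command script are hypothetical, and an agent would choose its commands step by step instead of using a fixed script.

```python
import io
import pdb


def buggy_mean(values):
    """Hypothetical function under investigation; the divisor is off by one."""
    total = sum(values)
    return total / (len(values) - 1)


# Scripted debugger session: step one line, inspect two values, then resume.
commands = io.StringIO("next\np total\np len(values)\ncontinue\n")
transcript = io.StringIO()

# Passing stdin/stdout makes pdb read commands from, and write output to,
# these in-memory streams. readrc=False skips any user .pdbrc file.
debugger = pdb.Pdb(stdin=commands, stdout=transcript, readrc=False)
result = debugger.runcall(buggy_mean, [2, 4, 6])

# The transcript holds what an agent would read back after each command.
print(transcript.getvalue())
print("returned:", result)
```

The transcript shows `total` as 12 and the list length as 3, while the function returns 6.0 instead of the expected mean of 4.0, which points at the divisor.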

EXPLORE →

Experiment

Trellis

Trellis is a research system for generating editable 3D assets from simple text or image prompts. Using a novel latent representation, it produces meshes, radiance fields, and 3D Gaussians with rich texture and structure—accelerating workflows in gaming, AR/VR, digital twins, and industrial design.

EXPLORE →

Experiment

NextCoder

NextCoder is a research model series designed to improve code editing. Using a novel synthetic data pipeline and the SeleKT adaptation algorithm, it teaches language models to handle diverse edit requirements while retaining strong code generation skills, outperforming peers across multiple benchmarks.

EXPLORE →

Experiment

TypeAgent

TypeAgent is research sample code exploring how to build a single personal agent with natural language interfaces. By distilling language models into logical structures for actions, memory, and plans, it enables safer, faster, and lower-cost ways to map user requests into meaningful applications.

EXPLORE →

Experiment

Magentic-UI

Magentic-UI is a research platform for advancing human-in-the-loop AI experiences. It explores co-planning, co-tasking, and task learning. Features like action guards and safe sandboxes enable more trustworthy human-AI collaboration.

EXPLORE →