Foundry Labs

Get a glimpse of potential future directions for AI with these experimental technologies from Microsoft Research and more.

  • SocialReasoning-Bench

    SocialReasoning-Bench is an open-source benchmark from Microsoft Research AI Frontiers that measures whether AI agents can negotiate competently and act in their user’s best interest in multi-party settings.

  • Microsoft Agent Framework

    The open-source framework for developers and AI engineers building production agentic applications — the direct successor to Semantic Kernel and AutoGen.

  • EO/OS Object Detection

    A Microsoft first-party Earth Observation and Overhead Sensing model that identifies and localizes objects in satellite and aerial imagery.

  • Model Router

    A trained language model in Microsoft Foundry that routes each prompt in real time to the most suitable underlying LLM.

  • MAI-Image-2-Efficient

    Microsoft’s latest text-to-image model, engineered for builders who need high-quality image generation at speed and scale.

  • harrier-oss-v1

    Harrier-OSS-v1 is a family of open-source multilingual text embedding models from Microsoft that delivers state-of-the-art retrieval and semantic understanding across 94 languages, available in three sizes from 270M to 27B parameters.

  • MAI-Voice-1

    MAI-Voice-1 is an advanced, next-generation neural text-to-speech (TTS) model.

  • MAI-Transcribe-1

    Speech recognition model that supports up to 25 languages

  • MAI-Image-2

    State-of-the-art text-to-image generation model from Microsoft AI

  • VibeVoice ASR

    A unified speech‑to‑text model designed to transcribe up to 60 minutes of continuous audio in a single pass.

  • Phi-4-Reasoning-Vision-15B

    A compact and smart open‑weight multimodal reasoning model that balances reasoning power, efficiency, and training data needs.

  • BugPilot

    Complex bug generation for efficient learning of software engineering skills.

  • Rho-Alpha

    Rho-alpha (ρα) is the first robotics model derived from Microsoft’s Phi series of vision-language models.

  • OptiMind

    OptiMind is a small language model designed to convert business problems described in natural language into the mathematical formulations needed by optimization software.

  • GigaTIME

    GigaTIME is a multimodal AI model for translating routinely available hematoxylin and eosin (H&E) pathology slides to virtual multiplex immunofluorescence (mIF) images.

  • Fara-7B

    Fara-7B is an open-weight, ultra-compact agentic small language model (SLM) designed for computer use. It enables on-device automation of real-world web tasks by interacting directly with interfaces, as a human would.

  • Promptions

    Promptions dynamically generates UI to help users steer AI responses more effectively. It’s simple, effective, and easily customizable, making it suitable for developers from individual vibe-coders to enterprise software engineers.

  • RosettaFold3

    RosettaFold3 (RF3) is a unified biomolecular modeling system that predicts 3D structures of proteins, nucleic acids, and small molecules within a single framework.

  • MMCT-Agent

    Complex visual reasoning on images and long-form videos

  • RetroChimera

    RetroChimera is a model that takes as input a target molecule that one wants to synthesize, encoded as a sequence of characters (using the SMILES notation), and produces several potential chemical reactions which could be used to produce that input molecule. Each reaction is represented as a group of ingredients (reactant molecules), and those molecules…

  • Data Formulator

    Data Formulator blends natural language and visual interfaces to help analysts explore and visualize data with AI agents. Starting with any data (clean or messy, small or large), analysts can easily describe their goals to AI agents and collaboratively explore data in different directions in parallel to discover new insights.

  • Magentic Marketplace

    Magentic Marketplace is an open-source simulation environment for exploring the numerous possibilities of agentic markets and their societal implications at scale. It provides a foundation for studying these markets and guiding them toward outcomes that benefit everyone. 

  • Skala

    Skala is a deep-learning-based exchange-correlation functional that achieves experimental accuracy in density functional theory. Trained on the largest high-accuracy dataset of molecular energies, Skala advances computational chemistry with reliable, scalable predictions for molecules and materials.

  • MatterGen

    MatterGen is a diffusion-based generative model for inorganic materials design. It can propose stable, novel crystal structures and be guided by target properties like bulk modulus, band gap, or magnetic density, accelerating materials discovery.

  • CalcLM

    CalcLM is a prototype from Azure AI Foundry Labs that experiments with bringing agentic AI into the grid interface of Excel. It enables users to express agent steps as formulas, chain tasks across cells, and quickly explore how simple agent workflows might flow in a familiar surface.

  • Debug-gym

    Debug-gym is an open-source research environment for teaching AI coding agents to debug more like humans—interactively and iteratively. With tools like Python’s pdb, agents can set breakpoints, inspect code, and run tests, enabling smarter, more reliable coding workflows.

  • Trellis

    Trellis is a research system for generating editable 3D assets from simple text or image prompts. Using a novel latent representation, it produces meshes, radiance fields, and 3D Gaussians with rich texture and structure—accelerating workflows in gaming, AR/VR, digital twins, and industrial design.

  • NextCoder

    NextCoder is a research model series designed to improve code editing. Using a novel synthetic data pipeline and the SeleKT adaptation algorithm, it teaches language models to handle diverse edit requirements while retaining strong code generation skills, outperforming peers across multiple benchmarks.

  • TypeAgent

    TypeAgent is research sample code exploring how to build a single personal agent with natural language interfaces. By distilling language models into logical structures for actions, memory, and plans, it enables safer, faster, and lower-cost ways to map user requests into meaningful applications.

  • Magentic-UI

    Magentic-UI is a research platform for advancing human-in-the-loop AI experiences. It explores co-planning, co-tasking, and task learning. Features like action guards and safe sandboxes enable more trustworthy human-AI collaboration.
