Model Router
Real-Time LLM Routing
Try Model Router on Microsoft Foundry →
About Model Router
Model Router is a trained routing model in Microsoft Foundry that dispatches each prompt in real time to the most suitable underlying LLM, exposing 18 models from OpenAI, Anthropic, DeepSeek, Meta, and xAI behind a single deployment. Operators choose a routing strategy — Balanced for consistency, Cost for expense optimization, or Quality for accuracy — and the router selects models accordingly. The deployment also provides automatic failover when an upstream provider is unavailable, prompt caching across models for identical inputs, and consistent tool-use semantics regardless of which underlying model handles a given call.
Model Router addresses the operational complexity that emerged as the LLM ecosystem fragmented across providers. Rather than maintaining provider-specific clients and hand-written routing rules, application teams call one endpoint and let the router learn which models excel at which prompts: coding requests go to models with strong logical reasoning, creative work to those with better prose. Unified caching reduces redundant spend on identical inputs, failover removes single-vendor risk, and consistent tool-use semantics let the underlying model change without application changes. This makes it core infrastructure for production deployments that operate heterogeneous model fleets.
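To picture how the three routing modes and failover interact, here is a minimal Python sketch. It is not how Model Router works internally: the real selector is a trained model, while this stand-in uses a hand-written scoring rule purely to illustrate the trade-off. The model names, provider names, cost and quality numbers, the `score` weighting, and the `route` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical underlying model the router could pick."""
    name: str
    provider: str
    cost: float     # relative cost per call (lower is cheaper); made-up value
    quality: float  # relative quality score (higher is better); made-up value


# Hypothetical pool; the real deployment exposes 18 models.
CANDIDATES = [
    Candidate("small-model", "provider-a", cost=1.0, quality=0.60),
    Candidate("medium-model", "provider-b", cost=3.0, quality=0.80),
    Candidate("large-model", "provider-c", cost=10.0, quality=0.95),
]


def score(candidate: Candidate, mode: str) -> float:
    """Toy scoring rule standing in for the trained selector."""
    if mode == "cost":
        return -candidate.cost
    if mode == "quality":
        return candidate.quality
    if mode == "balanced":
        # Assumed weighting: trade a little quality for lower cost.
        return candidate.quality - 0.05 * candidate.cost
    raise ValueError(f"unknown routing mode: {mode!r}")


def route(mode: str, unavailable_providers: frozenset = frozenset()) -> Candidate:
    """Pick the best-scoring candidate whose provider is currently reachable."""
    live = [c for c in CANDIDATES if c.provider not in unavailable_providers]
    if not live:
        raise RuntimeError("no provider available")
    return max(live, key=lambda c: score(c, mode))


if __name__ == "__main__":
    for mode in ("cost", "balanced", "quality"):
        print(mode, "->", route(mode).name)
    # Failover: if provider-c is down, Quality mode falls back to the next best.
    print("quality, provider-c down ->",
          route("quality", frozenset({"provider-c"})).name)
```

In this toy version failover is just a filter applied before selection, so the application code that calls `route` does not change when a provider drops out.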
Key capabilities
- Three routing modes: Balanced, Cost, and Quality
- Routes each prompt in real time across 18 underlying LLMs
- Unified deployment spanning OpenAI, Anthropic, DeepSeek, Meta, and xAI
- Built-in failover, prompt caching, and tool-use support
- Trained selector model rather than rule-based dispatch
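The caching capability listed above can be illustrated with a small Python sketch. The source does not describe the cache mechanism, so exact-match memoization keyed on a hash of the prompt is an assumption made here for illustration; production prompt caches commonly work on shared prompt prefixes instead. `PromptCache` and `fake_model` are hypothetical names, not part of any Foundry API.

```python
import hashlib
from typing import Callable, Dict


class PromptCache:
    """Toy exact-match cache keyed on the prompt text, not on the chosen model."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(prompt: str) -> str:
        # Identical prompts map to the same key regardless of which model answers.
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get_or_compute(self, prompt: str, call_model: Callable[[str], str]) -> str:
        k = self.key(prompt)
        if k in self._store:
            self.hits += 1
            return self._store[k]
        self.misses += 1
        result = call_model(prompt)  # only invoked on a cache miss
        self._store[k] = result
        return result


def fake_model(prompt: str) -> str:
    """Stand-in for whichever underlying model the router selected."""
    return prompt.upper()


if __name__ == "__main__":
    cache = PromptCache()
    cache.get_or_compute("hello", fake_model)
    cache.get_or_compute("hello", fake_model)  # served from the cache
    print(cache.hits, cache.misses)
```

Because the key depends only on the prompt, a repeated input is served from the cache even if the router would have picked a different model the second time.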
Ready to Explore?
Dive into platform integrations, source code, research papers, and announcements.