MAI-Thinking-1
Mid-Sized Sparse MoE Reasoning Model
Try MAI-Thinking-1 on Microsoft Foundry →
About MAI-Thinking-1
MAI-Thinking-1 is Microsoft AI’s first large language model and its first dedicated reasoning model. It is a sparse mixture-of-experts (MoE) model with 35 billion active parameters out of roughly 1 trillion total, so each request activates only the slices of the network it needs. The model was trained from scratch on clean data, with no distillation from third-party frontier models, and Microsoft built it end-to-end on its own infrastructure: architecture, pre-training, post-training, reasoning, evaluation, and safety. A 200K-token context window lets it analyze long documents, run multi-step reasoning, and process extended agent traces without chunking. It supports function calling and developer instructions out of the box.
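To make the "activates only the slices of the network needed" idea concrete, here is a minimal sketch of top-k expert routing, the common way sparse MoE layers select experts. It is an illustration only: the page does not describe MAI-Thinking-1's actual router, and the expert count (8), the number of active experts (2), the scalar toy experts, and the function names `route_top_k` and `moe_forward` are all assumptions made for this example. The last lines compute the active-parameter fraction from the two figures the text does give.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Renormalize the gate weights over the selected experts only.
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the selected experts' outputs; the rest are never called."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Toy setup (hypothetical): 8 scalar "experts", 2 active per input.
NUM_EXPERTS, TOP_K = 8, 2
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]
rng = random.Random(0)
router_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route_top_k(router_logits, TOP_K)
output = moe_forward(1.0, experts, router_logits, TOP_K)

# Figures quoted in the text: 35B active out of roughly 1T total parameters.
active_fraction = 35e9 / 1e12
print(f"selected experts: {[i for i, _ in selected]}")
print(f"active fraction: {active_fraction:.1%}")
```

With the quoted figures, roughly 3.5% of the parameters take part in any single forward pass, which is where the cost advantage of a sparse design comes from.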
On hard benchmarks it performs well above what its active parameter count would suggest: it matches Claude Opus 4.6 on SWE-Bench Pro at substantially lower cost, reaches 95.9% on AIME 2025, and earned gold at the 2025 International Mathematical Olympiad. In blind side-by-side evaluations, users rated it at parity with Claude Sonnet 4.6. By owning the model-building loop end-to-end and skipping third-party distillation, Microsoft positions MAI-Thinking-1 as a lower-cost foundation for enterprise reasoning workloads such as complex multi-step instructions, long-context analysis, and code generation.
Key capabilities
- 35B-active / ~1T-total sparse mixture-of-experts trained from scratch on clean data
- No distillation from third-party frontier models — Microsoft owns the model-building loop end-to-end
- Matches Claude Opus 4.6 on SWE-Bench Pro; gold at IMO 2025; 95.9% on AIME 2025
- 200K-token context for long documents, multi-step reasoning, and extended agent traces
- Function calling and developer instructions for production agent workloads
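As a rough picture of what function calling means on the application side, the sketch below dispatches a model-emitted tool call to a local Python function. The JSON shape (`name` plus `arguments`), the `get_weather` tool, and the `dispatch_tool_call` helper are hypothetical and chosen only for illustration; the page does not document the model's actual request or response format, so consult the platform documentation for the real schema.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical local tool; a real one would call a weather service."""
    return {"city": city, "forecast": "sunny"}


# Registry mapping tool names the model may emit to local callables.
TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(raw_call: str) -> str:
    """Parse a JSON tool call, run the matching function, return a JSON result."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    result = TOOLS[name](**call.get("arguments", {}))
    return json.dumps({"name": name, "result": result})


# Example payload in the assumed format (not taken from any published API).
model_output = '{"name": "get_weather", "arguments": {"city": "Seattle"}}'
print(dispatch_tool_call(model_output))
```

In an agent loop, the returned JSON string would typically be sent back to the model as the tool result so it can continue reasoning.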