Framework & SDK General-Purpose Experimental

ExACT

Reflective MCTS for AI Agents

Explore ExACT on GitHub →

About ExACT

ExACT is a research approach that teaches AI agents to explore decision spaces more effectively by combining test-time search with self-learning. It introduces Reflective Monte Carlo Tree Search (R-MCTS), which augments standard MCTS with contrastive reflection on past interactions and multi-agent debate for state evaluation. On the challenging VisualWebArena benchmark, GPT-4o with R-MCTS achieves a 6–30% relative improvement over the prior state of the art across various tasks. ExACT also introduces Exploratory Learning, a training strategy that internalizes search capabilities into the model itself.
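To make the search component concrete, here is a minimal sketch of the kind of MCTS loop that R-MCTS builds on: UCT selection, one-child expansion, evaluation by a value function instead of random rollouts, and backpropagation. This is not ExACT's implementation. The names `Node`, `search`, `best_action`, and the `toy_*` environment are all hypothetical, and `value_fn` is only a placeholder for the spot where a reflective, debate-based state evaluator would presumably plug in.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, Hashable, List, Optional


@dataclass
class Node:
    """One state in the search tree (hypothetical structure, for illustration)."""
    state: Hashable
    parent: Optional["Node"] = None
    children: Dict[Hashable, "Node"] = field(default_factory=dict)
    visits: int = 0
    value_sum: float = 0.0

    @property
    def q(self) -> float:
        # Mean of all values backed up through this node.
        return self.value_sum / self.visits if self.visits else 0.0


def uct_score(child: Node, parent_visits: int, c: float) -> float:
    # Standard UCT: exploitation term plus an exploration bonus.
    if child.visits == 0:
        return math.inf
    return child.q + c * math.sqrt(math.log(parent_visits) / child.visits)


def search(
    root_state: Hashable,
    actions_fn: Callable[[Hashable], List[Hashable]],
    step_fn: Callable[[Hashable, Hashable], Hashable],
    value_fn: Callable[[Hashable], float],
    iterations: int = 50,
    c: float = 1.0,
) -> Node:
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend while every action of the node has been tried.
        while node.children and len(node.children) == len(actions_fn(node.state)):
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: uct_score(ch, parent.visits, c))
        # Expansion: add one untried action, if any remain.
        untried = [a for a in actions_fn(node.state) if a not in node.children]
        if untried:
            action = untried[0]
            child = Node(step_fn(node.state, action), parent=node)
            node.children[action] = child
            node = child
        # Evaluation: placeholder for a learned or debate-based evaluator.
        value = value_fn(node.state)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value_sum += value
            node = node.parent
    return root


def best_action(root: Node) -> Hashable:
    # Pick the most-visited root action, a common final-move rule.
    return max(root.children, key=lambda a: root.children[a].visits)


# Toy environment (assumption): states are tuples of actions, depth limit 2,
# and only paths that start with "right" are rewarded.
def toy_actions(state):
    return ["left", "right"] if len(state) < 2 else []


def toy_step(state, action):
    return state + (action,)


def toy_value(state):
    return 1.0 if state and state[0] == "right" else 0.0
```

Calling `search((), toy_actions, toy_step, toy_value)` and then `best_action(root)` on this toy problem concentrates visits on the rewarded branch. Raising `iterations` is the simplest form of test-time compute scaling in this sketch: more simulations give the tree more evidence before the agent commits to an action.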

After Exploratory Learning fine-tuning, GPT-4o matches 87% of R-MCTS performance while spending significantly less inference compute, a practical step toward deployable exploratory agents. The work demonstrates how vision-language models can develop genuine exploratory reasoning, state evaluation, and backtracking behavior, and shows scaling along both training-time search (data collection) and test-time compute. ExACT is part of Microsoft’s broader research line on building o1-style exploratory agents for complex web and decision-making environments.
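One way to picture how search behavior could be turned into training data is to flatten the order in which a search visited states into a single action sequence that includes explicit backtracking steps. The sketch below is an illustration under assumptions, not ExACT's data pipeline: this page does not specify the training format, and the `linearize` function, the `"backtrack"` token, and the path representation are hypothetical.

```python
from typing import List, Sequence, Tuple


def linearize(visited_paths: Sequence[Tuple[str, ...]]) -> List[str]:
    """Flatten an ordered list of visited paths into one action trace.

    Each path is the tuple of actions leading from the root to a visited
    state. Moving between consecutive paths emits "backtrack" steps up to
    their common prefix, then the actions needed to reach the new path.
    """
    trace: List[str] = []
    current: Tuple[str, ...] = ()
    for target in visited_paths:
        # Length of the shared prefix between where we are and where we go.
        k = 0
        while k < min(len(current), len(target)) and current[k] == target[k]:
            k += 1
        # Undo the actions that are not shared, then take the new ones.
        trace.extend(["backtrack"] * (len(current) - k))
        trace.extend(target[k:])
        current = tuple(target)
    return trace


# Hypothetical search order: try "a" then "x", abandon it, then try "b", "y".
example = [("a",), ("a", "x"), ("b",), ("b", "y")]
trace = linearize(example)
# trace == ["a", "x", "backtrack", "backtrack", "b", "y"]
```

A trace like this keeps the exploration and the recovery from a poor branch, whereas training only on the final successful path (`["b", "y"]` in this example) would discard them.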

Key capabilities

  • Test-time compute scaling for agent decision quality
  • Reflective MCTS for AI agents exploring environments
  • Exploratory Learning balances exploitation vs. exploration
  • Adapts to changing environments via reflection
  • RL + MCTS research framework with open code
Technology Stack
Python RL MCTS