About ExACT
ExACT is a research approach that teaches AI agents to explore decision spaces more effectively by combining test-time search with self-learning. It introduces Reflective Monte Carlo Tree Search (R-MCTS), which augments standard MCTS with contrastive reflection on past interactions and multi-agent debate for state evaluation. On the challenging VisualWebArena benchmark, GPT-4o with R-MCTS achieves a 6–30% relative improvement over the prior state of the art across various tasks. ExACT also introduces Exploratory Learning, a training strategy that internalizes search capabilities into the model itself.
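For readers who have not worked with MCTS, the sketch below shows the plain search loop that R-MCTS builds on: select by UCT, expand, evaluate, and back up the value. It is a minimal illustration, not the ExACT implementation. The toy number-line task, the function names (`mcts`, `actions`, `step`, `evaluate`, `is_terminal`), and the exploration constant are all assumptions made for this example. The `evaluate` callable marks the point where R-MCTS would use its debate-based state evaluation, and the reflection step is omitted entirely.

```python
import math


class Node:
    """One state in the search tree, with running visit statistics."""

    def __init__(self, state, parent=None, action=None):
        self.state = state
        self.parent = parent
        self.action = action  # action that led from parent to this node
        self.children = []
        self.visits = 0
        self.value_sum = 0.0

    def mean_value(self):
        return self.value_sum / self.visits if self.visits else 0.0


def uct_score(child, parent_visits, c=1.4):
    """Upper confidence bound: mean value plus an exploration bonus."""
    if child.visits == 0:
        return math.inf  # always try unvisited children first
    bonus = c * math.sqrt(math.log(parent_visits) / child.visits)
    return child.mean_value() + bonus


def mcts(root_state, actions, step, evaluate, is_terminal, budget=200):
    """Run `budget` search iterations; return (most visited action, root)."""
    root = Node(root_state)
    for _ in range(budget):
        node = root
        # 1. Selection: follow the best UCT child until reaching a leaf.
        while node.children:
            parent = node
            node = max(parent.children,
                       key=lambda ch: uct_score(ch, parent.visits))
        # 2. Expansion: add all children of a non-terminal leaf.
        if not is_terminal(node.state):
            node.children = [Node(step(node.state, a), node, a)
                             for a in actions(node.state)]
            if node.children:
                node = node.children[0]
        # 3. Evaluation: score the state (hypothetical stand-in for the
        #    debate-based evaluator described in the text).
        value = evaluate(node.state)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value_sum += value
            node = node.parent
    best = max(root.children, key=lambda ch: ch.visits)
    return best.action, root


# Toy task (an assumption for illustration): walk a number line from 0
# to GOAL within a fixed number of steps. A state is (position, steps_left).
GOAL = 3


def actions(state):
    return [+1, -1]


def step(state, action):
    position, steps_left = state
    return (position + action, steps_left - 1)


def is_terminal(state):
    position, steps_left = state
    return position == GOAL or steps_left == 0


def evaluate(state):
    position, _ = state
    return 1.0 / (1 + abs(GOAL - position))  # 1.0 exactly at the goal


if __name__ == "__main__":
    action, root = mcts((0, 4), actions, step, evaluate, is_terminal)
    print("chosen first action:", action)
```

Raising `budget` spends more search iterations per decision, which is the knob that test-time compute scaling turns in a search-based agent.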
After Exploratory Learning fine-tuning, GPT-4o matches 87% of R-MCTS performance while using significantly less inference compute, a practical step toward deployable exploratory agents. The work shows that vision-language models can learn exploratory reasoning, state evaluation, and backtracking, and that performance scales with both training-time search (data collection) and test-time compute. ExACT is part of Microsoft’s broader research line on building o1-style exploratory agents for complex web and decision-making environments.
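One way to picture how search behavior could be turned into training data is to flatten a search tree into a single trajectory that keeps the evaluation and backtracking steps, rather than only the final successful actions. The sketch below is a hypothetical illustration of that idea and does not reproduce the ExACT data format. The nested-dict tree layout, the event names (`evaluate`, `act`, `backtrack`), and the page and action names are all assumptions.

```python
def linearize(node, events=None):
    """Depth-first walk of a search tree.

    Returns (events, found_goal). Children are visited in insertion
    order, and the walk stops as soon as a goal node is reached.
    """
    if events is None:
        events = []
    # Record the state evaluation made on arrival at this node.
    events.append(("evaluate", node["state"], node["value"]))
    if node.get("goal"):
        return events, True
    for action, child in node.get("children", {}).items():
        events.append(("act", action))
        _, found = linearize(child, events)
        if found:
            return events, True
        # The subtree was a dead end: record the return to this state.
        events.append(("backtrack", node["state"]))
    return events, False


# Hypothetical two-branch tree: the first branch fails, the second succeeds.
tree = {
    "state": "home",
    "value": 0.5,
    "children": {
        "click_search": {"state": "search", "value": 0.2, "children": {}},
        "click_cart": {"state": "cart", "value": 0.9, "goal": True},
    },
}

if __name__ == "__main__":
    trajectory, found = linearize(tree)
    for event in trajectory:
        print(event)
```

Keeping only the `act` events on the successful path would give an imitation-style trajectory. Keeping the full event list exposes the evaluations and the backtrack from the dead-end branch as well.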
Key capabilities
- Test-time compute scaling for agent decision quality
- Reflective MCTS for AI agents exploring environments
- Exploratory Learning for balancing exploitation and exploration
- Adaptation to changing environments via reflection
- RL + MCTS research framework with open code