Superior Computer Chess with Model Predictive Control, Reinforcement Learning, and Rollout Algorithms

Modern AI for strategic games such as chess has evolved from early rule-based systems, including brute-force engines that relied on handcrafted heuristics and exhaustive search, to learning-based methods that create and refine their own strategies. Reinforcement learning (RL) programs such as AlphaZero mastered chess starting from only the basic rules, learning to evaluate positions and plan in real time by iteratively improving through trial and error. In parallel, Model Predictive Control (MPC), a method from control theory, operates by repeatedly planning a few steps ahead and optimizing decisions based on predictions of future outcomes.

Bringing MPC and RL together produces a powerful hybrid. MPC uses RL-trained evaluators for short-term planning, while rollout methods further refine decisions through simulated lookahead and online improvement. Game-oriented AI systems that combine learned representations, predictive control, and rollout strategies are increasingly capable of high performance, adaptability, and deeper lookahead at runtime.

Researchers at Arizona State University have developed a novel chess engine architecture that enhances existing engines by combining model predictive control, reinforcement learning, and rollout algorithms for superior move prediction and evaluation. MPC-MC is a framework that integrates two separate chess engines, a position evaluator and a nominal opponent, to perform a one-step lookahead search: for each candidate move, the nominal opponent predicts the reply and the evaluator scores the resulting position, and the move with the best score is selected. Variants include deterministic and stochastic approaches, as well as strategies to fortify move selection against strong opponents. Experiments demonstrate significant performance improvements across top-tier engines, with potential applicability to other two-player, zero-sum deterministic games.
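
The one-step lookahead described above can be illustrated with a minimal Python sketch. It is not the researchers' implementation. To stay self-contained it replaces chess with a toy take-away game (players alternately remove 1 to 3 stones; whoever takes the last stone wins), and the two "engines" are plain functions. All names here (select_move, evaluate, nominal_opponent, and so on) are hypothetical and chosen for illustration only.

```python
from typing import Callable, Iterable, Tuple

# Toy state: (stones left, True if it is our turn to move).
State = Tuple[int, bool]


def legal_moves(state: State) -> Iterable[int]:
    """A move removes 1, 2, or 3 stones, never more than remain."""
    stones, _ = state
    return range(1, min(3, stones) + 1)


def apply_move(state: State, take: int) -> State:
    """Remove the stones and pass the turn to the other player."""
    stones, our_turn = state
    return (stones - take, not our_turn)


def is_terminal(state: State) -> bool:
    return state[0] == 0


def evaluate(state: State) -> float:
    """Stand-in for the position-evaluation engine, scored from our side.

    In this toy game the side to move loses exactly when the pile is a
    multiple of 4 (this also covers the empty pile).
    """
    stones, our_turn = state
    side_to_move_wins = stones % 4 != 0
    return 1.0 if side_to_move_wins == our_turn else -1.0


def nominal_opponent(state: State) -> int:
    """Stand-in for the nominal-opponent engine: leave a multiple of 4 if possible."""
    take = state[0] % 4
    return take if take else 1


def select_move(
    state: State,
    evaluate: Callable[[State], float],
    nominal_opponent: Callable[[State], int],
) -> Tuple[int, float]:
    """One-step lookahead: our move, predicted reply, then position score."""
    best_move, best_score = None, float("-inf")
    for move in legal_moves(state):
        after_ours = apply_move(state, move)
        if is_terminal(after_ours):
            # Our move ended the game, so there is no reply to predict.
            score = evaluate(after_ours)
        else:
            reply = nominal_opponent(after_ours)
            score = evaluate(apply_move(after_ours, reply))
        if score > best_score:
            best_move, best_score = move, score
    return best_move, best_score


move, score = select_move((10, True), evaluate, nominal_opponent)
print(move, score)  # prints "2 1.0"
```

Passing the evaluator and the nominal opponent as arguments mirrors the idea that existing engines are used as interchangeable components. A stochastic variant could, under the same assumptions, sample several replies from the opponent model and average the resulting scores.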

Potential Applications

Model Predictive Control (MPC)
Broader AI Optimization & Algorithm Design
Adaptive Enemy AI & Non-Player Characters (NPCs)
Real-Time Strategy (RTS) and Sports-Style Games

Benefits and Advantages

Flexible – architecture supporting deterministic and stochastic opponent modeling
Scalable – lookahead depth balances performance with computational cost
Generalizable – framework applicable to other strategic games

For more information about this opportunity, please see

Gundawar et al. – arXiv – 2024

For more information about the inventor(s) and their research, please see

Dr. Bertsekas' departmental webpage
