Background
Multi-agent reinforcement learning (MARL) involves multiple agents interacting within a common environment to jointly maximize a long-term reward. The key challenge in MARL is its combinatorial nature: the joint state-action space grows exponentially with the number of agents, resulting in high computational complexity. Moreover, each agent's behavior depends on what the other agents are concurrently learning, which makes the learning problem non-stationary. In more complex tasks, each agent may also receive only sparse rewards over long horizons, creating additional challenges for effective learning.
Invention Description
Researchers at Arizona State University have developed a decentralized graph-based reinforcement learning using reward machines (DGRM) framework that enables a collection of agents to solve complex, temporally extended tasks under coupled dynamics with improved computational efficiency. The framework uses reward machines (RMs) to describe the environment, track each agent's progress through its task, and encode a sparse reward function for each agent. This ensures consistent reward expectations and supports decentralized problem-solving for complex, temporally extended tasks.
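The reward-machine idea above can be sketched as a small finite-state machine whose transitions fire on high-level events and emit rewards. This is an illustrative sketch only; the class and event names are assumptions, not the authors' implementation.

```python
# Minimal reward-machine sketch (names are hypothetical, not the DGRM code).
# A reward machine is a finite-state machine over high-level events: appending
# its current state to an agent's observation makes an otherwise
# non-Markovian, temporally extended reward Markovian.

class RewardMachine:
    def __init__(self, transitions, initial_state):
        # transitions: {(state, event): (next_state, reward)}
        self.transitions = transitions
        self.state = initial_state

    def step(self, event):
        """Advance on an observed event; unknown events self-loop with 0 reward."""
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0)
        )
        self.state = next_state
        return reward

# Example task: reach waypoint "a", then deliver at "b".
rm = RewardMachine(
    transitions={
        ("u0", "a"): ("u1", 0.0),      # progress: waypoint reached
        ("u1", "b"): ("u_done", 1.0),  # task complete: sparse reward emitted
    },
    initial_state="u0",
)
print(rm.step("b"))  # 0.0 -- visiting "b" before "a" makes no progress
print(rm.step("a"))  # 0.0 -- advances the machine to u1
print(rm.step("b"))  # 1.0 -- task complete
```

Note that the same environment event ("b") yields different rewards depending on the machine's state, which is exactly how the RM tracks progress through a temporally extended task.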
Potential Applications:
- Decision-making frameworks for autonomous systems
- Robotics & unmanned vehicles
- Wireless communication networks
Benefits and Advantages:
- Consistent reward expectations – transforms non-Markovian rewards into a Markovian setting
- Decentralized approach – mitigates sparse rewards and non-stationarity
- Enhanced coordination among multiple agents – includes high-level task specifications
- Reduces computational complexity – incorporates localized policy learning and truncated Q-functions
- Improves scalability – can be applied across multiple complex settings
Related Publication: Decentralized Graph-Based Multi-Agent Reinforcement Learning using Reward Machines