# MULTI-AGENT DEEP REINFORCEMENT LEARNING

@inproceedings{Stanford2016MULTIAGENTDR, title={MULTI-AGENT DEEP REINFORCEMENT LEARNING}, author={Maxim Egorov Stanford}, year={2016} }

This work introduces a novel approach for solving reinforcement learning problems in multi-agent settings. We propose a state reformulation of multi-agent problems in ℝ² that allows the system state to be represented in an image-like fashion. We then apply deep reinforcement learning techniques with a convolutional neural network as the Q-value function approximator to learn distributed multi-agent policies. Our approach extends the traditional deep reinforcement learning algorithm by making use…
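The image-like state representation mentioned in the abstract can be illustrated with a minimal sketch. The paper's actual encoding is not given in this excerpt, so everything here is an assumption made for illustration: the function name `rasterize_state`, the three-channel layout (controlled agent, allies, opponents), the grid size, and the use of per-cell counts. Plain Python lists are used to keep the sketch dependency-free; an array library would be the usual choice before feeding the result to a convolutional network.

```python
def rasterize_state(self_pos, ally_pos, opponent_pos, grid_size=8, extent=1.0):
    """Encode 2D positions as a 3 x grid_size x grid_size image-like grid.

    Hypothetical channel layout (an assumption, not taken from the paper):
      0: the agent being controlled, 1: allies, 2: opponents.
    Positions are (x, y) pairs expected to lie in [0, extent) on both axes.
    """
    image = [[[0.0] * grid_size for _ in range(grid_size)] for _ in range(3)]
    groups = [[self_pos], list(ally_pos), list(opponent_pos)]
    for channel, positions in enumerate(groups):
        for x, y in positions:
            # Map continuous coordinates to a cell index, clamped to the grid.
            col = min(max(int(x / extent * grid_size), 0), grid_size - 1)
            row = min(max(int(y / extent * grid_size), 0), grid_size - 1)
            image[channel][row][col] += 1.0  # count agents sharing a cell
    return image


state = rasterize_state(
    self_pos=(0.10, 0.10),
    ally_pos=[(0.50, 0.50)],
    opponent_pos=[(0.90, 0.20), (0.95, 0.20)],
)
```

Because each agent can build such a grid from its own point of view, the same network could in principle be shared across agents, which is one plausible reading of "distributed multi-agent policies" in the abstract.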

#### 38 Citations

MAGNet: Multi-agent Graph Network for Deep Multi-agent Reinforcement Learning

- Computer Science
- 2019 XVI International Symposium "Problems of Redundancy in Information and Control Systems" (REDUNDANCY)
- 2019

This paper proposes a novel approach to multi-agent reinforcement learning that utilizes a relevance graph representation of the environment obtained by a self-attention mechanism, and a message-generation technique, and shows that it significantly outperforms state-of-the-art MARL solutions.

Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications

- Computer Science
- ArXiv
- 2018

This paper addresses an important aspect of deep RL related to situations that require multiple agents to communicate and cooperate to solve complex tasks, and covers challenges including non-stationarity, partial observability, continuous state and action spaces, multi-agent training schemes, and multi-agent transfer learning.

Deep Multi-Agent Reinforcement Learning with Relevance Graphs

- Computer Science
- ArXiv
- 2018

A novel approach to multi-agent reinforcement learning (MARL) is proposed that utilizes a relevance graph representation of the environment, obtained by a self-attention mechanism, together with a message-generation technique inspired by the NerveNet architecture; the approach significantly outperforms state-of-the-art MARL solutions.

When Does Communication Learning Need Hierarchical Multi-Agent Deep Reinforcement Learning

- Computer Science
- Cybern. Syst.
- 2019

This paper contributes a hierarchical deep reinforcement learning model for multi-agent systems that uses a hierarchical policy to separate the communication and coordination task from action selection.

Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications

- Computer Science, Mathematics
- IEEE Transactions on Cybernetics
- 2020

A survey of different approaches to problems in multiagent deep RL (MADRL) is presented, covering nonstationarity, partial observability, continuous state and action spaces, multiagent training schemes, and multiagent transfer learning.

Learning Efficient Coordination Strategy for Multi-step Tasks in Multi-agent Systems using Deep Reinforcement Learning

- Computer Science
- ICAART
- 2020

The experimental results showed that two network arrangements, centralized and decentralized deep Q-networks (DQNs), could learn coordinated policies to manage agents using local view inputs, and could thus improve overall performance.

CESMA: Centralized Expert Supervises Multi-Agents

- Computer Science
- ArXiv
- 2019

This work considers the reinforcement learning problem of training multiple agents in order to maximize a shared reward, and shows that one can obtain decentralized solutions to a multi-agent problem through imitation learning.

Multiagent Motion Planning Based on Deep Reinforcement Learning in Complex Environments

- Computer Science
- 2021 6th International Conference on Control and Robotics Engineering (ICCRE)
- 2021

A mixed-experience multiagent deep deterministic policy gradient algorithm, referred to as ME-MADDPG, is proposed; it increases the amount of high-quality experience by using the artificial potential field method and uses dynamic probability to sample from different replay buffers.

Decentralized Multi-Agents by Imitation of a Centralized Controller

- 2021

We consider a multi-agent reinforcement learning problem where each agent seeks to maximize a shared reward while interacting with other agents, and they may or may not be able to communicate.…

Reconfigurable Multi-Agent Manufacturing through Deep Reinforcement Learning: A Research Agenda

- 2019

Multi-agent systems are a key enabler of reconfigurable manufacturing, a key feature of Industry 4.0 intended to reduce time-to-market and enable mass-individualization of products. While significant…

#### References

Showing 1-10 of 44 references

Cooperative Multi-agent Control Using Deep Reinforcement Learning

- Computer Science
- AAMAS Workshops
- 2017

It is shown that policy gradient methods tend to outperform both temporal-difference and actor-critic methods and that curriculum learning is vital to scaling reinforcement learning algorithms in complex multi-agent domains.

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

- Computer Science, Mathematics
- NIPS
- 2017

An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.

Learning to Communicate with Deep Multi-Agent Reinforcement Learning

- Computer Science
- NIPS
- 2016

By embracing deep neural networks, this work is able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability.

Continuous control with deep reinforcement learning

- Computer Science, Mathematics
- ICLR
- 2016

This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.

Asynchronous Methods for Deep Reinforcement Learning

- Computer Science, Mathematics
- ICML
- 2016

A conceptually simple and lightweight framework for deep reinforcement learning is presented that uses asynchronous gradient descent for optimization of deep neural network controllers; asynchronous actor-critic is shown to succeed on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.

Multiagent cooperation and competition with deep reinforcement learning

- Computer Science, Biology
- PloS one
- 2017

The present work shows that Deep Q-Networks can become a useful tool for studying decentralized learning of multiagent systems coping with high-dimensional environments and describes the progression from competitive to collaborative behavior when the incentive to cooperate is increased.

Opponent Modeling in Deep Reinforcement Learning

- Computer Science, Mathematics
- ICML
- 2016

Inspired by the recent success of deep reinforcement learning, this work presents neural-based models that jointly learn a policy and the behavior of opponents, and uses a Mixture-of-Experts architecture to encode observation of the opponents into a deep Q-Network.

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

- Computer Science, Mathematics
- ICML
- 2018

QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations, and structurally enforces that the joint action-value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning.

Is multiagent deep reinforcement learning the answer or the question? A brief survey

- Computer Science
- ArXiv
- 2018

This article provides a clear overview of current multiagent deep reinforcement learning (MDRL) literature and provides guidelines to complement this emerging area by showcasing examples on how methods and algorithms from DRL and multiagent learning (MAL) have helped solve problems in MDRL and providing general lessons learned from these works.

Markov Games as a Framework for Multi-Agent Reinforcement Learning

- Computer Science
- ICML
- 1994

A Q-learning-like algorithm for finding optimal policies is described, and its application to a simple two-player game in which the optimal policy is probabilistic is demonstrated.