Towards Conflict Resolution with Deep Multi-Agent Reinforcement Learning
Abstract
Safety in Air Traffic Management (ATM) at the tactical level is ensured by human controllers. Automatic Conflict Detection and Resolution (CD&R) tools are one way to assist controllers in their tasks. However, the majority of existing methods do not account for factors that affect the quality and efficiency of resolutions. Furthermore, future challenges such as sustainability and the environmental impact of aviation must be tackled. In this work, we propose an innovative approach to pairwise conflict resolution, modelling it as a Multi-Agent Reinforcement Learning (MARL) problem in order to improve the quality of resolutions based on a combination of several factors. We use Multi-Agent Deep Deterministic Policy Gradient (MADDPG) to generate resolution maneuvers. We propose a reward function that, besides solving the conflicts, attempts to optimize the resolutions in terms of time, fuel consumption, and airspace complexity. The models are evaluated on real traffic, with a data augmentation technique used to increase the variance of conflict geometries. We achieve promising results, with a resolution rate of 93%, without the agents having any prior knowledge of the dynamics of the environment. Furthermore, the agents appear to learn desirable behaviors, such as preferring small heading changes that solve conflicts in a single time step. Nevertheless, the non-stationarity of the environment makes the learning procedure non-trivial. We discuss ways in which tangible qualities, such as the resolution rate, and intangible qualities, such as resolution acceptability and explainability, can be improved.
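To make the composite reward idea concrete, the following is a minimal sketch of how a per-agent reward combining conflict status, resolution time, fuel consumption, and airspace complexity might be assembled. The weights, term names, and normalisation here are illustrative assumptions for exposition only, not the formulation used in the paper.

import numpy as np

# Hypothetical weights for the individual reward terms (illustrative values).
W_CONFLICT = 10.0   # penalty applied while the pairwise conflict persists
W_TIME     = 0.1    # penalty per elapsed time step until resolution
W_FUEL     = 0.05   # penalty proportional to extra fuel burned by the maneuver
W_COMPLEX  = 0.5    # penalty for added airspace complexity (e.g. large heading changes)

def resolution_reward(in_conflict: bool,
                      elapsed_steps: int,
                      extra_fuel: float,
                      heading_change_deg: float) -> float:
    """Combine conflict status and maneuver costs into a single scalar reward."""
    reward = 0.0
    if in_conflict:
        reward -= W_CONFLICT                                 # conflict not yet resolved
    reward -= W_TIME * elapsed_steps                         # favour quick resolutions
    reward -= W_FUEL * extra_fuel                            # favour fuel-efficient maneuvers
    reward -= W_COMPLEX * abs(heading_change_deg) / 180.0    # favour small heading changes
    return reward

# Example: a conflict resolved after 3 steps with a 10-degree heading change.
print(resolution_reward(False, elapsed_steps=3, extra_fuel=1.2, heading_change_deg=10.0))

In a MADDPG setting, each agent would receive such a reward at every step, with the conflict-penalty term dominating until separation is restored and the remaining terms steering the policy toward short, fuel-efficient, low-complexity maneuvers.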