Using Reinforcement Learning to Provide Decision Support in Multi-Domain Mass Evacuation Operations

ISSN: 3005-2092

This article examined a MAJMAR scenario in which a large number of individuals, whose health deteriorates stochastically over time, are stranded at a remote location and must be evacuated. Within this context, a multi-domain evacuation operation was examined in which individuals are evacuated by either air or sea, with the aim of maximizing the number of survivors.

Citation:

Rempel M.; Shiell N.

ABSTRACT

In this paper, we study a scenario in which a large number of individuals in various levels of medical distress are stranded at a remote location, such as in the Arctic, and must be evacuated. Set within this context, we examine a multi-domain operation in which individuals are evacuated in one of two ways, either by helicopter or by ship, each with its own capacity constraints. The aim of this research is to determine a decision policy that maximizes the number of survivors. This is achieved by seeking a policy that, throughout the operation, effectively coordinates the selection of those individuals to be evacuated via helicopter and those to be evacuated via ship. Our contributions are twofold. First, we formulate the multi-domain mass evacuation operation as a Markov Decision Process. Second, because the curse of dimensionality renders exact methods inapplicable, we employ an Artificial Intelligence framework, namely Reinforcement Learning (RL), also known as Approximate Dynamic Programming (ADP) within operations research, to learn a near-optimal policy. Using a value function approximation based on state aggregation, we design an ADP algorithm to learn a policy within the context of a representative planning scenario. We then apply this policy across a range of test scenarios and compare the outcomes to those achieved using non-coordinated benchmark policies. Although our learned policy does not outperform all benchmarks, our results demonstrate how Artificial Intelligence may be used to evaluate candidate policies and provide decision support in multi-domain operations.
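To make the idea of a value function approximation based on state aggregation more concrete, the following is a minimal sketch on a toy evacuation problem. It is not the authors' model or algorithm: the two health levels, the capacities, the ship schedule, the deterioration probabilities, the two loading rules, and all function names are hypothetical, and the learning rule is a plain tabular temporal-difference update applied to aggregated states.

```python
import random
from collections import defaultdict

# All numbers below are hypothetical and chosen only for illustration.
HELI_CAP, SHIP_CAP, SHIP_PERIOD, HORIZON = 2, 6, 3, 12
P_WORSEN, P_DIE = 0.2, 0.3  # per-step chance: stable -> critical, critical -> deceased
ACTIONS = ("critical_first", "stable_first")  # hypothetical loading rules


def aggregate(state):
    """Map the full state (t, stable, critical) to a coarse bucket."""
    t, stable, critical = state
    bucket = lambda n: min(n // 3, 3)
    return (t % SHIP_PERIOD, bucket(stable), bucket(critical))


def step(state, action, rng):
    """Load the arriving vehicles, then let health deteriorate stochastically."""
    t, stable, critical = state
    # The helicopter arrives every step; the ship only every SHIP_PERIOD steps.
    cap = HELI_CAP + (SHIP_CAP if t % SHIP_PERIOD == SHIP_PERIOD - 1 else 0)
    if action == "critical_first":
        take_c = min(critical, cap)
        take_s = min(stable, cap - take_c)
    else:
        take_s = min(stable, cap)
        take_c = min(critical, cap - take_s)
    stable -= take_s
    critical -= take_c
    died = sum(rng.random() < P_DIE for _ in range(critical))
    worsened = sum(rng.random() < P_WORSEN for _ in range(stable))
    next_state = (t + 1, stable - worsened, critical - died + worsened)
    return next_state, take_s + take_c  # reward: people evacuated this step


def learn(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Epsilon-greedy temporal-difference learning over aggregated states."""
    rng = random.Random(seed)
    q = defaultdict(float)  # estimates indexed by (aggregated state, action)
    for _ in range(episodes):
        state = (0, 10, 5)  # hypothetical start: 10 stable, 5 critical
        while state[0] < HORIZON:
            agg = aggregate(state)
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(agg, a)])
            next_state, reward = step(state, action, rng)
            if next_state[0] >= HORIZON:
                future = 0.0
            else:
                nxt = aggregate(next_state)
                future = max(q[(nxt, a)] for a in ACTIONS)
            q[(agg, action)] += alpha * (reward + future - q[(agg, action)])
            state = next_state
    return q
```

Calling `learn()` returns a table of value estimates keyed by aggregated state and action. Many distinct full states share one entry, which is what keeps the table small at the cost of some approximation error.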

SAS-ORA Conference 2022

OR&A: new ideas, old realities