Adversary decision-making using Markov models
MetadataShow full item record
This study conducts three experiments on adversary decision-making modeled as a graph. Each experiment has the overall goal to understand how to exploit an adversary’s decision-making in order to obtain desired outcomes, as well as specific goals unique to each experiment. The first experiment models adversary decision-making using an Absorbing Markov chain (AMC). A sensitivity analysis of states (nodes in the graph) and actions (edges in the graph) is conducted which informs how downstream adversary decisions could be manipulated. The next experiment uses a Markov decision process (MDP). Assuming the adversary is initially blind to the rewards they will receive when they take an action, a Q´learning algorithm is used to determine the sequence of actions that maximizes the adversary rewards (called an optimum policy). This experiment gives insight in the possible decision-making of an adversary. Lastly, in the third experiment a two-player Markov game is developed, played by an agent (friend) and the adversary (foe). The agents goal is to decrease the overall rewards the adversary receives when it follows optimum policy. All experiments are demonstrated using specific examples.
Elizabeth Andreas, Jessica Dorismond, and Marco Gamarra "Adversary decision-making using Markov models", Proc. SPIE 12538, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications V, 125380I (12 June 2023); https://doi.org/10.1117/12.2655223