Evolving Policy Sets for Multi-Policy Decision Making (Preprint)

Preprint, 2021



Multi-Policy Decision Making (MPDM) is a planning framework in which an agent dynamically switches among a set of policies, using forward simulation to predict how each policy will perform. In virtually all MPDM approaches, however, the set of policies is created by domain experts. In this paper, we instead learn these policy sets offline. We use an evolutionary algorithm, which allows us to optimize the performance of the policy set directly rather than through some proxy objective. We also propose the use of Terminal, an online strategy game, as an evaluation domain for planning algorithms. Like many real-world robotics problems, Terminal requires multi-agent planning, coping with uncertainty, and working within practical limits on computational complexity. We describe how we used our approach to generate an agent that is ranked in the top 10 in a global online competition.
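
The two ideas in the abstract, choosing a policy by forward simulation and evolving the policy set against its end-to-end performance, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 1-D toy domain, the fixed-step policies, the names make_policy, rollout, mpdm_choose, fitness, and evolve, and the simple mutate-and-keep-if-not-worse loop standing in for the evolutionary algorithm.

```python
import random
from typing import Callable, List, Sequence, Tuple


def make_policy(step: float) -> Callable[[float], float]:
    """Hypothetical policy: move toward the origin by a fixed step size."""
    def policy(state: float) -> float:
        if state > 0:
            return -step
        if state < 0:
            return step
        return 0.0
    return policy


def rollout(step: float, state: float, horizon: int) -> float:
    """Forward-simulate one policy; score is negative distance to the origin."""
    policy = make_policy(step)
    for _ in range(horizon):
        state += policy(state)
    return -abs(state)


def mpdm_choose(steps: Sequence[float], state: float, horizon: int = 3) -> int:
    """MPDM-style selection: index of the policy with the best predicted score."""
    scores = [rollout(s, state, horizon) for s in steps]
    return max(range(len(steps)), key=lambda i: scores[i])


def episode_return(steps: Sequence[float], start: float, length: int = 10) -> float:
    """Run an episode in which the agent re-selects a policy at every step."""
    state = start
    for _ in range(length):
        chosen = steps[mpdm_choose(steps, state)]
        state += make_policy(chosen)(state)
    return -abs(state)


def fitness(steps: Sequence[float], starts: Sequence[float]) -> float:
    """Performance of the whole policy set, averaged over start states."""
    return sum(episode_return(steps, s) for s in starts) / len(starts)


def evolve(steps: Sequence[float], starts: Sequence[float],
           generations: int = 50, sigma: float = 0.2,
           seed: int = 0) -> Tuple[List[float], float]:
    """Assumed stand-in for the evolutionary search: mutate, keep if not worse."""
    rng = random.Random(seed)
    best, best_fit = list(steps), fitness(steps, starts)
    for _ in range(generations):
        child = [max(0.0, s + rng.gauss(0.0, sigma)) for s in best]
        child_fit = fitness(child, starts)
        if child_fit >= best_fit:
            best, best_fit = child, child_fit
    return best, best_fit


if __name__ == "__main__":
    starts = [3.0, -2.5, 0.7]
    evolved, score = evolve([0.1, 1.0], starts)
    print(evolved, score)
```

Note that the fitness function scores the policy set by running the full select-and-execute loop, so the search optimizes the set's actual performance rather than a proxy, which is the property the abstract emphasizes.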


    TITLE      = {Evolving Policy Sets for Multi-Policy Decision Making (Preprint)},
    AUTHOR     = {Maximilian Krogius and Edwin Olson},
    BOOKTITLE  = {Preprint},
    YEAR       = {2021},
    MONTH      = {May},
    KEYWORDS   = {Planning, Evolution},