Publication Details
An Oracle-Guided Approach to Constrained Policy Synthesis Under Uncertainty
Andriushchenko Roman, Ing. (DITS FIT BUT)
Češka Milan, doc. RNDr., Ph.D. (DITS FIT BUT)
Junges Sebastian (RWTH Aachen University)
Katoen Joost-Pieter (RWTH Aachen University)
Macák Filip, Ing. (DITS FIT BUT)
Markov decision processes, model-based reasoning, search, decision making under uncertainty
Dealing with aleatoric uncertainty is key in many domains involving sequential decision making, e.g., AI planning, network protocols, and symbolic program synthesis. This paper presents a general-purpose, model-based framework that obtains policies operating in uncertain environments in a fully automated manner. The new concept of coloured Markov decision processes (MDPs) enables a succinct representation of a wide range of synthesis problems: a coloured MDP describes a collection of possible policy configurations together with their structural dependencies. The framework covers the synthesis of (a) programmatic policies from probabilistic program sketches and (b) finite-state controllers representing policies for partially observable MDPs (POMDPs), including decentralised as well as constrained POMDPs. We show that all these synthesis problems can be cast as exploring memoryless policies in the corresponding coloured MDP. This exploration uses a symbiosis of two orthogonal techniques: abstraction refinement (using a novel refinement method) and counterexample generalisation. Our approach outperforms dedicated synthesis techniques on some problems and significantly improves on an earlier version of this framework.
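To give a flavour of the core idea, the sketch below is a deliberately simplified illustration (not the paper's algorithm, which uses abstraction refinement and counterexample generalisation): it treats the set of memoryless deterministic policies of a tiny hand-made MDP as a policy family and explores it exhaustively, evaluating each member's probability of reaching a goal state by value iteration. All names and the toy model are invented for this example.

```python
# Illustrative sketch only: brute-force exploration of all memoryless
# deterministic policies of a tiny hypothetical MDP, scoring each by the
# probability of reaching a goal state (computed via value iteration).
from itertools import product

# Toy MDP: states 0 and 1 are controllable; state 3 is the goal,
# state 2 is a losing sink. transitions[state][action] = [(next, prob), ...]
transitions = {
    0: {"a": [(1, 0.9), (2, 0.1)], "b": [(3, 0.5), (2, 0.5)]},
    1: {"a": [(3, 0.8), (2, 0.2)], "b": [(0, 1.0)]},
}
GOAL, SINK = 3, 2

def reach_prob(policy, iters=1000):
    """Probability of reaching GOAL from state 0 under a memoryless policy."""
    v = {0: 0.0, 1: 0.0, GOAL: 1.0, SINK: 0.0}
    for _ in range(iters):
        for s, act in policy.items():
            v[s] = sum(p * v[t] for t, p in transitions[s][act])
    return v[0]

# The policy family: every assignment of one action to each controllable state.
best = max(
    (dict(zip(transitions, choice)) for choice in product("ab", repeat=len(transitions))),
    key=reach_prob,
)
print(best, reach_prob(best))
```

In the paper's setting such a family is encoded symbolically as a coloured MDP, and the exhaustive loop above is replaced by oracle-guided pruning, which is what makes realistically large families tractable.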
@ARTICLE{FITPUB13365,
  author   = "Roman Andriushchenko and Milan \v{C}e\v{s}ka and Sebastian Junges and Joost-Pieter Katoen and Filip Mac\'{a}k",
  title    = "An Oracle-Guided Approach to Constrained Policy Synthesis Under Uncertainty",
  journal  = "Journal of Artificial Intelligence Research",
  volume   = 82,
  pages    = "433--469",
  year     = 2025,
  ISSN     = "1076-9757",
  doi      = "10.1613/jair.1.16593",
  language = "english",
  url      = "https://www.fit.vut.cz/research/publication/13365"
}