Authors: Marius Belly, Nathanaël Fijalkow, Hugo Gimbert, Florian Horn, Guillermo A. Pérez, Pierre Vandenhove
Abstract: Partially observable Markov decision processes (POMDPs) form a prominent
model for uncertainty in sequential decision making. We are interested in
constructing algorithms with theoretical guarantees to determine whether the
agent has a strategy ensuring a given specification with probability 1. This
well-studied problem is known to be undecidable already for very simple
omega-regular objectives, because of the difficulty of reasoning on uncertain
events. We introduce a revelation mechanism which restricts information loss by
requiring that almost surely the agent has eventually full information of the
current state. Our main technical results are to construct exact algorithms for
two classes of POMDPs called weakly and strongly revealing. Importantly, the
decidable cases reduce to the analysis of a finite belief-support Markov
decision process. This yields a conceptually simple and exact algorithm for a
large class of POMDPs.
Source: http://arxiv.org/abs/2412.12063v1