We consider a class of infinite-state stochastic games generated by stateless pushdown automata (or, equivalently, 1-exit recursive state machines), where the winning objective is specified by a regular set of target configurations and a qualitative probability constraint '>0' or '=1'. The goal of one player is to maximize the probability of reaching the target set so that the constraint is satisfied, while the other player aims at the opposite. We show that the winner in such games can be determined in P for the '>0' constraint, and in NP ∩ co-NP for the '=1' constraint. Further, we prove that the winning regions for both players are regular, and we design algorithms which compute the associated finite-state automata. Finally, we show that winning strategies can be synthesized effectively.
One-counter MDPs (OC-MDPs) and one-counter simple stochastic games (OC-SSGs) are 1-player, and 2-player turn-based zero-sum, stochastic games played on the transition graph of classic one-counter automata (equivalently, pushdown automata with a 1-letter stack alphabet). A key objective for the analysis and verification of these games is the termination objective, where the players aim to maximize (minimize, respectively) the probability of hitting counter value 0, starting at a given control state and given counter value.Recently, we studied qualitative decision problems ("is the optimal termination value equal to 1?") for OC-MDPs (and OC-SSGs) and showed them to be decidable in polynomial time (in NP ∩ coNP, respectively). However, quantitative decision and approximation problems ("is the optimal termination value at least p", or "approximate the termination value within ε") are far more challenging. This is so in part because optimal strategies may not exist, and because even when they do exist they can have a highly non-trivial structure.It thus remained open even whether any of these quantitative termination problems are computable. In this paper we show that all quantitative approximation problems for the termination value for OC-MDPs and OC-SSGs are computable. Specifically, given an OC-SSG, and given ε > 0, we can compute a value v that approximates the value of the OC-SSG termination game within additive error ε, and furthermore we can compute ε-optimal strategies for both players in the game.A key ingredient in our proofs is a subtle martingale, derived from solving certain linear programs that we can associate with a maximizing OC-MDP. An application of Azuma's inequality on these martingales yields a computable bound for the "wealth" at which a "rich person's strategy" becomes ε-optimal for OC-MDPs.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.