In many real-world problems, there is a dynamic interaction between competitive agents. Partially observable stochastic games (POSGs) are among the most general formal models that capture such dynamic scenarios. The model captures stochastic events, partial information of players about the environment, and the scenario does not have a fixed horizon. Solving POSGs in the most general setting is intractable.Therefore, the research has been focused on subclasses of POSGs that have a value of the game and admit designing (approximate) optimal algorithms. We propose such a subclass for two-player zero-sum games with discounted-sum objective function—POSGs with public observations (POPOSGs)—where each player is able to reconstruct beliefs of the other player over the unobserved states. Our results include: (1) theoretical analysis of PO-POSGs and their value functions showing convexity (concavity) in beliefs of maximizing (minimizing) player, (2) a novel algorithm for approximating the value of the game, and (3) a practical demonstration of scalability of our algorithm. Experimental results show that our algorithm can closely approximate the value of non-trivial games with hundreds of states.
Partially observable Markov decision processes (POMDPs) are the standard models for planning under uncertainty with both finite and infinite horizon. Besides the well-known discountedsum objective, indefinite-horizon objective (aka Goal-POMDPs) is another classical objective for POMDPs. In this case, given a set of target states and a positive cost for each transition, the optimization objective is to minimize the expected total cost until a target state is reached. In the literature, RTDP-Bel or heuristic search value iteration (HSVI) have been used for solving Goal-POMDPs. Neither of these algorithms has theoretical convergence guarantees, and HSVI may even fail to terminate its trials. We give the following contributions: (1) We discuss the challenges introduced in Goal-POMDPs and illustrate how they prevent the original HSVI from converging. (2) We present a novel algorithm inspired by HSVI, termed Goal-HSVI, and show that our algorithm has convergence guarantees. (3) We show that Goal-HSVI outperforms RTDP-Bel on a set of well-known examples.
The Varroa destructor mite is one of the most dangerous Honey Bee (Apis mellifera) parasites worldwide and the bee colonies have to be regularly monitored in order to control its spread. In this paper we present an object detector based method for health state monitoring of bee colonies. This method has the potential for online measurement and processing. In our experiment, we compare the YOLO and SSD object detectors along with the Deep SVDD anomaly detector. Based on the custom dataset with 600 ground-truth images of healthy and infected bees in various scenes, the detectors reached the highest F1 score up to 0.874 in the infected bee detection and up to 0.714 in the detection of the Varroa destructor mite itself. The results demonstrate the potential of this approach, which will be later used in the real-time computer vision based honey bee inspection system. To the best of our knowledge, this study is the first one using object detectors for the Varroa destructor mite detection on a honey bee. We expect that performance of those object detectors will enable us to inspect the health status of the honey bee colonies in real time.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.