Factored Markov Decision Processes

Degris, Thomas; Sigaud, Olivier

doi:10.1002/9781118557426.ch4

Cited by 12 publications

(12 citation statements)

References 11 publications


“…Here, this ‘curse of dimensionality’ is problematic because the state space is large (3^Θ states) and eradication actions can occur at any node (2^Θ possible actions), making the number of combinations that must be considered during optimisation excessively large. While factored representations can overcome large state space constraints (Degris & Sigaud ), few techniques exist to deal with large action spaces. We apply a method developed for spatial problems, the graph‐based Markov decision process (GMDP) (Sabbadin, Peyrard & Forsell ).…”

Section: Methods (mentioning)

confidence: 99%

Finding the best management policy to eradicate invasive species from spatial ecological networks with simultaneous actions

Nicol, Sabbadin, Peyrard et al. 2017

Journal of Applied Ecology


1. Spatial management of invasive species is more likely to be successful when multiple locations are treated simultaneously. However, selecting the best locations to act is difficult due to the many options available at any time. 2. We design a near-optimal policy for applying multiple actions simultaneously for faster invasive species control within a network. Our method uses a recent optimisation tool, the graph-based Markov decision process (GMDP). Since the policy can be difficult to interpret, we extracted a simpler policy using classification trees. We applied our approach to the eradication of invasive mosquitofish Gambusia holbrooki from the habitat of the red-finned blue-eye Scaturiginichthys vermeilipinnis, a critically endangered fish with a global population that is restricted to seven artesian springs in Queensland, Australia. 3. The policy returned by the GMDP was to manage springs occupied by mosquitofish and their connected neighbours, unless the neighbours were occupied by red-finned blue-eyes. 4. Simultaneous management resulted in rapid declines in simulated mosquitofish occupancy even if eradication effectiveness was low; however, the cost of simultaneous eradication was high and sustained eradication effort was necessary to maintain low mosquitofish occupancy. 5. Synthesis and applications. Our paper finds a near-optimal, multi-action control policy to remove an invasive species from a multi-species spatial network. We introduce the graph-based Markov decision process and apply it to a real case study: eradication of invasive mosquitofish from the habitat of the red-finned blue-eye. We find that the graph-based Markov decision process can generate policies for networks with extremely large state spaces; however, it works best when nodes have fewer than five neighbours. We conclude that simultaneous eradications are effective for rapid control of invasive species; however, managers should consider the cost and time required for an effective eradication program.
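The scaling argument in this entry can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from the cited paper: it assumes a network of Θ nodes with 3 states and 2 actions per node (the figures quoted in the citation statement), and it assumes a factored layout in which each node's transition depends only on its own state, its neighbours' states, and its own action. The function names are hypothetical.

```python
def joint_sizes(num_nodes: int, states_per_node: int = 3, actions_per_node: int = 2):
    """Return (joint states, joint actions, joint state-action pairs) for a flat MDP."""
    n_states = states_per_node ** num_nodes
    n_actions = actions_per_node ** num_nodes
    return n_states, n_actions, n_states * n_actions


def local_table_entries(num_nodes: int, max_neighbours: int,
                        states_per_node: int = 3, actions_per_node: int = 2) -> int:
    """Upper bound on transition-table entries under the assumed factored layout.

    Each node stores P(next local state | own state, neighbour states, local action).
    """
    scope = max_neighbours + 1  # the node itself plus its neighbours
    per_node = (states_per_node ** scope) * actions_per_node * states_per_node
    return num_nodes * per_node


if __name__ == "__main__":
    for theta in (5, 10, 20):
        states, actions, pairs = joint_sizes(theta)
        local = local_table_entries(theta, max_neighbours=4)
        print(f"nodes={theta}: joint pairs={pairs}, local entries={local}")
```

Under these assumptions the flat model grows exponentially in the number of nodes, while the local tables grow linearly in the number of nodes and exponentially only in neighbourhood size, which is consistent with the abstract's remark that the method works best when nodes have fewer than five neighbours.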


“…These problems are implemented within the RLPark software package [7]. These particular problems were chosen both because they represent classic benchmarks for RL and they represent a range of reward structures.…”

Section: Problems (mentioning)

confidence: 99%

Genetically-regulated Neuromodulation Facilitates Multi-Task Reinforcement Learning

Cussat‐Blanc, Harrington 2015

Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation


In this paper, we use a gene regulatory network (GRN) to regulate a reinforcement learning controller, the State-Action-Reward-State-Action (SARSA) algorithm. The GRN serves as a neuromodulator of SARSA's learning parameters: learning rate, discount factor, and memory depth. We have optimized GRNs with an evolutionary algorithm to regulate these parameters on specific problems but with no knowledge of problem structure. We show that genetically-regulated neuromodulation (GRNM) performs comparably to or better than SARSA with fixed parameters. We then extend the GRNM SARSA algorithm to multi-task problem generalization, and show that GRNs optimized on multiple problem domains can generalize to previously unknown problems with no further optimization.


“…Chou et al [18] provide an example of the use of a semi-Markov process for optimizing the time to initiate medical treatment. Factored MDPs recognize that some problems have multiple independent variables that define the state space, a characteristic that can be exploited to some computational advantage (Degris and Sigaud [23]). Another significant extension is to consider the addition of constraints (e.g., constraints on total cost over multiple decision epochs), which leads to challenges in developing algorithms because many such problems no longer retain the attractive decomposable structure of a dynamic program (Altman [3]).…”

Section: MDP Model Formulation (mentioning)

confidence: 99%

Optimization of Sequential Decision Making for Chronic Diseases: From Data to Decisions

Denton 2018

Recent Advances in Optimization and Modeling of Contemporary Problems


Rapid advances in healthcare for chronic diseases such as cardiovascular disease, cancer, and diabetes have made it possible to detect diseases at early stages and tailor treatment based on individual patient risk factors including demographic factors and disease-specific biomarkers. However, a large number of relevant risk factors, combined with uncertainty in future health outcomes and the side effects of health interventions, makes clinical management of diseases challenging for physicians and patients. Data-driven operations research methods have the potential to help improve medical decision making by using observational data that are now routinely collected in many health systems. Optimization methods in particular, such as Markov decision processes and partially observable Markov decision processes, have the potential to improve the protracted sequential decision-making process that is common to many chronic diseases. This tutorial provides an introduction to some of the most commonly used methods for building and solving models to optimize sequential decision making. The context of chronic diseases is emphasized, but the methods apply broadly to sequential decision making under uncertainty. We pay special attention to the challenges associated with using observational data and the influence of model parameter uncertainty and ambiguity.

Keywords: stochastic dynamic programming • Markov decision process • hidden Markov model • chronic disease • data analytics
