Aggregation of the policy iteration method for nearly completely decomposable Markov chains

Aldhaheri, Rabah W.; Khalil, Hassan K.

doi:10.1109/9.67293

Cited by 36 publications

(51 citation statements)

References 23 publications


“…Since here the AIB and IB methods coincide, this example shows that the relaxation of the optimization problem does not necessarily lead to the optimal partition.…”

Section: Example (mentioning)

confidence: 87%

“…Given the transition matrix of a Markov chain, they obtained a bipartition of its state space via alternating projection. Aldhaheri and Khalil considered optimal control of nearly completely decomposable Markov chains and adapted Howard's algorithm to work on an aggregated model [4]. The work of Jia considers state aggregation of Markov decision processes optimal w.r.t.…”

Section: A. Contributions and Related Work (mentioning)

confidence: 99%

“…Indeed, in stochastic modeling in computational biology [1], or in n-gram word models in speech recognition [2], dealing with the state space explosion is a major challenge. Also in control theory, particularly for nearly completely decomposable Markov chains, state space reduction is an important topic [3], [4]. A direct way of reducing the state space of a Markov chain is aggregation: with the help of a partition function, groups of nodes in the original transition graph are aggregated, resulting in a graph with a smaller number of nodes.…”

Section: Introduction (mentioning)

confidence: 99%


Optimal Kullback–Leibler Aggregation via Information Bottleneck

Geiger

Petrov

Kubin

et al. 2015

IEEE Trans. Automat. Contr.


Abstract: In this paper, we present a method for reducing a regular, discrete-time Markov chain (DTMC) to another DTMC with a given, typically much smaller number of states. The cost of reduction is defined as the Kullback-Leibler divergence rate between a projection of the original process through a partition function and a DTMC on the correspondingly partitioned state space. Finding the reduced model with minimal cost is computationally expensive, as it requires an exhaustive search among all state space partitions and an exact evaluation of the reduction cost for each candidate partition. Our approach deals with the latter problem by minimizing an upper bound on the reduction cost instead of the exact cost. The proposed upper bound is easy to compute, and it is tight if the original chain is lumpable with respect to the partition. We then express the problem in the form of information bottleneck optimization and propose using the agglomerative information bottleneck algorithm to search for a sub-optimal partition greedily rather than exhaustively. The theory is illustrated with examples and one application scenario in the context of modeling bio-molecular interactions.
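To make the idea of reducing a chain through a partition function concrete, here is a minimal sketch in plain Python. It is not the paper's information bottleneck algorithm and does not search over partitions; it only builds one reduced chain for a fixed, given partition. The stationary-weighted averaging rule, the function names `stationary` and `aggregate`, and the 4-state example matrix are illustrative assumptions, not taken from the source.

```python
def stationary(P, iters=10_000, tol=1e-12):
    """Approximate the stationary distribution of a regular chain by power iteration."""
    n = len(P)
    mu = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(mu, nxt)) < tol:
            return nxt
        mu = nxt
    return mu


def aggregate(P, partition, num_groups):
    """Aggregate P over a partition given as a list mapping state -> group index.

    Assumed rule (one common choice): weight each state by its stationary
    probability, so that
        Q[a][b] = sum_{i in a} mu[i] * sum_{j in b} P[i][j] / sum_{i in a} mu[i]
    """
    mu = stationary(P)
    n = len(P)
    mass = [0.0] * num_groups
    flow = [[0.0] * num_groups for _ in range(num_groups)]
    for i in range(n):
        a = partition[i]
        mass[a] += mu[i]
        for j in range(n):
            flow[a][partition[j]] += mu[i] * P[i][j]
    return [[flow[a][b] / mass[a] for b in range(num_groups)]
            for a in range(num_groups)]


# Hypothetical 4-state chain: two strongly coupled pairs with weak coupling 0.1.
P = [
    [0.60, 0.30, 0.05, 0.05],
    [0.30, 0.60, 0.05, 0.05],
    [0.05, 0.05, 0.60, 0.30],
    [0.05, 0.05, 0.30, 0.60],
]
Q = aggregate(P, partition=[0, 0, 1, 1], num_groups=2)
```

In this example every state of a group sends the same total probability to each group, so the chain is lumpable with respect to the partition and the reduced matrix does not depend on the weights used.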


“…Figure 1 shows a 150-point sample of the chain when c = 0.1 in (34). Notice the weak interaction between the two groups (1,2) and (3,4) and the strong interaction between the states in each group. The solid line in Figure 3 shows the error probabilities of the state estimates.…”

Section: Simulation Examples (mentioning)

confidence: 99%

“…These techniques have since been studied in the context of singularly perturbed systems [3, 4]. The ideas behind Courtois aggregation have been extended, resulting in the technique called stochastic complementation. Stochastic complementation is applicable whether or not the system is NCD. Unlike Courtois aggregation, the exact steady state probability distribution for the full system can be reconstructed.…”

Section: Introduction (mentioning)

confidence: 99%

Adaptive estimation of hidden nearly completely decomposable Markov chains with applications in blind equalization

Krishnamurthy

1994

Adaptive Control & Signal


This paper proposes maximum likelihood (ML) estimation schemes for nearly completely decomposable Markov chains (NCDMCs) in white Gaussian noise. Aggregation techniques based on stochastic complementation are applied to significantly reduce the dimension of the resulting hidden Markov model (HMM) and hence substantially reduce the computational requirements of the estimation algorithms. Stochastic complementation results in exact aggregation in that no approximations are involved in the steady state probability distribution of the Markov chain. We then present an off-line estimation algorithm for the parameters and states of the HMM based on the estimation of the aggregated HMM. This off-line algorithm is an ML estimation scheme and is based on the expectation maximization (EM) algorithm. It has a significantly reduced computational complexity compared with the standard (full-order) EM-based HMM estimation scheme. Finally, we present an application of our techniques. We show that hidden NCDMCs can be used to formulate the blind equalization problem for noisy FIR channels with Markov inputs, e.g. phase-shift-keyed (PSK) signals. We then propose recursive EM and gradient estimation techniques for the aggregated HMM, resulting in on-line estimates of the channel coefficients and signal estimate. For an N-state Markov chain our aggregate-based estimation scheme has a computational complexity O ( N i ) , whereas standard algorithms have a complexity O(N;*') at each time instant, where L is the length of the FIR channel. KEY WORDS: EM algorithm; hidden Markov model; Markov chains; nearly completely decomposable (NCD)
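As a rough illustration of stochastic complementation, the sketch below computes the stochastic complement of one group of states using the standard block formula S11 = P11 + P12 (I - P22)^(-1) P21. It is plain Python under stated assumptions: the helper names `solve` and `stochastic_complement` and the 4-state example chain are hypothetical, and nothing here reproduces the paper's HMM estimation schemes.

```python
def solve(A, B):
    """Solve A X = B for X by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]  # augmented matrix [A | B]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


def stochastic_complement(P, group):
    """Return S11 = P11 + P12 (I - P22)^(-1) P21 for the states listed in `group`."""
    rest = [i for i in range(len(P)) if i not in group]
    P11 = [[P[i][j] for j in group] for i in group]
    P12 = [[P[i][j] for j in rest] for i in group]
    P21 = [[P[i][j] for j in group] for i in rest]
    I_minus_P22 = [[(1.0 if i == j else 0.0) - P[i][j] for j in rest]
                   for i in rest]
    X = solve(I_minus_P22, P21)  # X = (I - P22)^(-1) P21
    k = len(rest)
    return [[P11[a][b] + sum(P12[a][m] * X[m][b] for m in range(k))
             for b in range(len(group))] for a in range(len(group))]


# Hypothetical NCD chain: groups {0, 1} and {2, 3} with weak coupling 0.1.
P = [
    [0.60, 0.30, 0.05, 0.05],
    [0.30, 0.60, 0.05, 0.05],
    [0.05, 0.05, 0.60, 0.30],
    [0.05, 0.05, 0.30, 0.60],
]
S = stochastic_complement(P, group=[0, 1])
```

For an irreducible chain the result is itself a stochastic matrix on the chosen group: probability that leaves the group is routed back to wherever it eventually re-enters, which is why no approximation of the steady state distribution is involved.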


Approximate optimal adaptive control for weakly coupled nonlinear systems: A neuro‐inspired approach

García‐Carrillo

Vamvoudakis

Hespanha

2015

Adaptive Control & Signal


Summary: This paper proposes a new approximate dynamic programming algorithm to solve the infinite‐horizon optimal control problem for weakly coupled nonlinear systems. The algorithm is implemented as a three‐critic/four‐actor approximator structure, where the critic approximators are used to learn the optimal costs, while the actor approximators are used to learn the optimal control policies. Simultaneous continuous‐time adaptation of both critic and actor approximators is implemented, a method commonly known as synchronous policy iteration. The adaptive control nature of the algorithm requires a persistence of excitation condition to be guaranteed a priori, but this can be relaxed by using previously stored data concurrently with current data in the update of the critic approximators. Appropriate robustifying terms are added to the controllers to eliminate the effects of the residual errors, leading to asymptotic stability of the equilibrium point of the closed‐loop system. Simulation results show the effectiveness of the proposed approach for a sixth‐order dynamical example. Copyright © 2015 John Wiley & Sons, Ltd.
