2009
DOI: 10.1111/j.1467-9868.2009.00698.x

On-Line Expectation–Maximization Algorithm for Latent Data Models

Abstract: We propose a generic on-line (also sometimes called adaptive or recursive) version of the expectation–maximization (EM) algorithm applicable to latent variable models of independent observations. Compared with the algorithm of Titterington, this approach is more directly connected to the usual EM algorithm and does not rely on integration with respect to the complete-data distribution. The resulting algorithm is usually simpler and is shown to achieve convergence to the stationary points of the Kullback–Leibler …
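The algorithm sketched in the abstract replaces the E-step by a stochastic-approximation update of the complete-data sufficient statistic; in the paper's sufficient-statistic notation the recursion takes roughly the following form:

```latex
\hat{s}_{n+1} = \hat{s}_n + \gamma_{n+1}\bigl(\bar{s}(Y_{n+1};\hat{\theta}_n) - \hat{s}_n\bigr),
\qquad
\hat{\theta}_{n+1} = \bar{\theta}(\hat{s}_{n+1}),
```

where $\bar{s}(y;\theta) = \mathbb{E}_{\theta}[S(X)\mid Y=y]$ is the conditional expectation of the complete-data sufficient statistic, $\bar{\theta}(\cdot)$ is the complete-data maximum-likelihood map, and $(\gamma_n)$ is a decreasing step-size sequence.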

Cited by 366 publications (434 citation statements); references 20 publications.
“…Since our online approach builds upon the EM formalism presented by Cappé et al [4], our notation in the following derivations is closely related to that of the latter. If we write our parameter vector θ as…”
Section: A. Preliminaries (mentioning; confidence: 99%)
“…Figure 5 shows an example of online parameter estimation using our EM framework, on a data set comprising 5000 samples. Cappé et al [4] resolve issues due to the dependency on the initialization by updating only ŝ for a number of first iterations. Thus, we use the first 100 observations to build up an estimate of ŝ^(100), before calculating the first parameter estimate θ̂^(101).…”
Section: Online Estimation (mentioning; confidence: 99%)
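The burn-in trick quoted above (updating only ŝ for an initial stretch of observations before emitting parameter estimates) is easy to demonstrate. Below is a minimal, hypothetical sketch of online EM for a two-component one-dimensional Gaussian mixture; the function name, the step-size choice γ_n = n^(−0.6), and the 100-observation burn-in are illustrative, not taken verbatim from the cited works.

```python
import math

def online_em_gmm(data, mu_init, n_burn=100, alpha=0.6):
    """Online EM for a two-component 1-D Gaussian mixture.

    Running sufficient statistics s[k] = (E[w_k], E[w_k*y], E[w_k*y^2])
    are updated by stochastic approximation; parameters are re-estimated
    from s only after an initial burn-in during which s alone is updated.
    """
    K = 2
    pi = [1.0 / K] * K
    mu = list(mu_init)
    var = [1.0] * K
    # Seed the running sufficient statistics from the initial parameters.
    s = [[pi[k], pi[k] * mu[k], pi[k] * (var[k] + mu[k] ** 2)] for k in range(K)]
    for n, y in enumerate(data, start=1):
        gamma = n ** (-alpha)  # decreasing step size
        # E-step: posterior responsibilities under the current parameters.
        dens = [
            pi[k] * math.exp(-(y - mu[k]) ** 2 / (2 * var[k]))
            / math.sqrt(2 * math.pi * var[k])
            for k in range(K)
        ]
        tot = sum(dens)
        w = [d / tot for d in dens]
        # Stochastic-approximation update of the sufficient statistics.
        for k in range(K):
            sk = (w[k], w[k] * y, w[k] * y * y)
            s[k] = [s[k][j] + gamma * (sk[j] - s[k][j]) for j in range(3)]
        if n <= n_burn:
            continue  # burn-in: update only the statistics, not the parameters
        # M-step: map the averaged statistics back to parameters.
        for k in range(K):
            pi[k] = s[k][0]
            mu[k] = s[k][1] / s[k][0]
            var[k] = max(s[k][2] / s[k][0] - mu[k] ** 2, 1e-6)
    return pi, mu, var
```

Because each observation touches the statistics once and the M-step is a closed-form map, the per-sample cost is constant, which is the appeal of the sufficient-statistic formulation in the quoted passage.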
“…The online EM was first introduced in [25]. It was subsequently generalized in [26], where the authors present a proof of convergence and show asymptotic equivalence to natural-gradient ascent. Moreover, the algorithm performs simple and efficient updates, making good use of the sufficient statistics of the exponential family so as to avoid redundant calculations.…”
Section: A. Proposed Approach (mentioning; confidence: 99%)
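The convergence and natural-gradient results referenced in this quote rest on the usual Robbins–Monro conditions on the step sizes; a polynomially decaying sequence is a common concrete choice (the exponent range below is the standard one for this class of algorithms, stated here as background rather than quoted from the cited works):

```latex
\sum_{n \ge 1} \gamma_n = \infty,
\qquad
\sum_{n \ge 1} \gamma_n^2 < \infty,
\qquad
\text{e.g. } \gamma_n = n^{-\alpha}, \ \alpha \in (1/2, 1].
```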
“…For instance, if the coefficients of the reward functions evolve at each time-step according to a linear model, then their parameters can be estimated through the online expectation-maximization algorithm [20]. In the dynamic MABC problem, an accurate model of the dynamics can improve not only the estimation accuracy of the reward function parameters, but also provide valuable information for the action selection process.…”
Section: Estimation of Time-Varying Reward Functions (mentioning; confidence: 99%)