observations, these techniques can be used to choose the HMM that most likely generated the sequence of observations and, in the process, also characterize the associated probability of error (for the given sequence of observations). However, in order to measure the classification capability of the classifier before making any observations, one needs to compute the a priori probability that the classifier makes an incorrect decision over all possible sequences of observations. Enumerating all possible sequences of a given length (in order to evaluate their contribution to the probability of error) is prohibitively expensive for long sequences; thus, we describe ways to avoid this computational complexity and obtain an upper bound on the probability that the classifier makes an error without having to enumerate all possible output sequences. Specifically, we present a constructive approach that bounds the probability of error as a function of the observation step, and we discuss necessary and sufficient conditions for this bound on the probability of error to go to zero as the number of observations increases.

After obtaining bounds on the probability of erroneous classification, we consider the additional challenge that the observed sequence is corrupted by noise stemming from sensor malfunctions, communication limitations, or other adversarial conditions. For example, depending on the underlying application, the information that the sensors provide may be corrupted due to inaccurate measurements, limited resolution, or degraded sensor performance (due to aging or hardware failures). We consider unreliable sensors that may cause output symbols to be deleted, inserted, substituted, or transposed with certain known probabilities. Under such sensor malfunctions, the length of the observed sequence will generally not equal the length of the output sequence and, in fact, several output sequences may correspond to a given observed sequence.
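To make the classification rule and the cost of exact enumeration concrete, the following sketch (the function names and toy parameters are ours, purely illustrative) evaluates the likelihood of an observation sequence under each candidate HMM with the standard forward algorithm, applies the maximum a posteriori decision rule, and computes the a priori error probability by brute-force enumeration over all sequences of a fixed length; the exponential number of sequences in this last step is precisely what becomes prohibitive for long sequences.

```python
import numpy as np
from itertools import product

def forward_probability(pi, A, B, obs):
    """Standard forward algorithm: P(obs | model) for an HMM with initial
    distribution pi, transition matrix A, and emission matrix B."""
    alpha = pi * B[:, obs[0]]          # alpha_1(i) = pi_i * b_i(o_1)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # induction over observation steps
    return alpha.sum()                 # terminate: sum over final states

def map_classify(models, priors, obs):
    """MAP rule: choose the model maximizing prior * likelihood."""
    scores = [p * forward_probability(*m, obs) for m, p in zip(models, priors)]
    return int(np.argmax(scores))

def error_probability_by_enumeration(models, priors, n_symbols, length):
    """A priori misclassification probability for sequences of a fixed
    length, by brute force over all n_symbols**length sequences: the MAP
    decision loses exactly the probability mass of the non-maximal models."""
    err = 0.0
    for y in product(range(n_symbols), repeat=length):
        scores = [p * forward_probability(*m, y) for m, p in zip(models, priors)]
        err += sum(scores) - max(scores)
    return err
```

For two equiprobable toy models whose emission distributions are mirror images of one another, the enumerated error probability decreases as the sequence length grows, which is the behavior the bounds discussed above are designed to capture without the enumeration.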
Thus, one would first need to identify all possible state sequences and the probabilities with which they agree with both the underlying model and the observations (after allowing, of course, for sensor failures). In particular, if symbols in the output sequence can be deleted, there may be an infinite number of output sequences that agree with a given observed sequence, which makes the standard forward algorithm inapplicable for classification. This limitation can be overcome via an iterative algorithm that allows us to efficiently compute the probability that a certain model matches the observed sequence: each time a new observation is made, the algorithm simply updates the information it keeps track of and outputs, on demand, the probability that a given model has produced the sequence observed so far. The iterative algorithm we describe relates to (and generalizes) iterative algorithms for the evaluation problem in HMMs (Rabiner, 1989), the parsing problem in probabilistic automata (PA) (Fu, 1982), (Vidal et al., 2005), a...
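To convey the flavor of such an iterative update, the sketch below shows one standard way to fold an unbounded number of deletions into a single per-observation step, under the simplifying assumption (ours, not necessarily the chapter's exact error model) that each output symbol is independently deleted with probability p_del. Summing over the number of deleted outputs that may precede each observed symbol gives a geometric matrix series whose closed form is the matrix inverse below, so the infinitely many consistent output sequences are handled in finite work per observation.

```python
import numpy as np

def deletion_tolerant_update(alpha, T, p_del, y):
    """One iteration of a forward-style update when each output symbol is
    deleted with probability p_del. T[o] is the matrix whose (i, j) entry
    is P(next state j, output o | current state i), so M = sum_o T[o] is
    the ordinary (stochastic) state transition matrix. Any number k of
    deleted outputs may precede the observed symbol y; summing the series
    sum_k (p_del * M)**k gives the closed form (I - p_del * M)^{-1}."""
    n = alpha.shape[0]
    M = sum(T.values())                       # one hidden step, any output
    D = np.linalg.inv(np.eye(n) - p_del * M)  # 0, 1, 2, ... deletions
    return alpha @ D @ ((1 - p_del) * T[y])   # then y survives and is observed
```

With p_del = 0 this reduces to the classical forward induction step, and for p_del < 1 the update conserves total probability over the possible next observed symbols, so the on-demand matching probability is simply the running sum of the vector's entries.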