2007
DOI: 10.1155/2007/90947

NML Computation Algorithms for Tree-Structured Multinomial Bayesian Networks

Abstract: Typical problems in bioinformatics involve large discrete datasets. Therefore, in order to apply statistical methods in such domains, it is important to develop efficient algorithms suitable for discrete data. The minimum description length (MDL) principle is a theoretically well-founded, general framework for performing statistical inference. The mathematical formalization of MDL is based on the normalized maximum likelihood (NML) distribution, which has several desirable theoretical properties. In the case o…

Citations: cited by 6 publications (9 citation statements)
References: 24 publications
“…According to the MDL principle, learning can be equated with finding regularities with data. Consequently the more the data is compressed the more the data is learnt [9].…”
Section: Minimum Description Length Principle; citation type: mentioning
confidence: 99%
“…The model class is only used as a technical device for constructing an efficient code for describing the data. [9].…”
Section: Minimum Description Length Principle; citation type: mentioning
confidence: 99%
“…Exact and computationally tractable formulas are rare: results for multinomial models are given in [10], and for Bayesian networks with structural restrictions in [11], [12], [13]; more references can be found in [3] and [4]. Similarly to the present work, in the context of structural equation models, Preacher et al [14] estimate the normalizing coefficient by sampling random data-sets from a uniform distribution using Markov chain Monte Carlo (MCMC) methods.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…The FFT method involves utilization of Newton's method and is explained in the paper [1]. However, the usefulness of this approach is unclear as some earlier tests with the multinomial normalizing term [12] show that the used floating point numbers must have very high precision in practical cases. This is due to the fact that the values of the normalizing terms can be quite large, and consequently, as the data size increases, the precision of the floating point numbers must also increase.…”
Section: Theorem 2 (The Miller Formula) If Two Formal Power Series Are; citation type: mentioning
confidence: 99%
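
As a rough illustration of the multinomial normalizing term discussed in the statement above, the following Python sketch evaluates it in ordinary double precision. It does not use the FFT approach the statement refers to. Instead it assumes the standard definition of the binomial case as a sum of maximized likelihoods, combined with a known recurrence over the number of categories, C(k+2, n) = C(k+1, n) + (n/k) C(k, n). The function names `binomial_normalizer` and `multinomial_normalizer` are hypothetical and chosen only for this example.

```python
import math


def binomial_normalizer(n):
    """C(2, n): sum over h of binom(n, h) * (h/n)^h * ((n-h)/n)^(n-h)."""
    if n == 0:
        return 1.0
    total = 0.0
    for h in range(n + 1):
        # Work in log space so the binomial coefficient cannot overflow.
        log_term = (math.lgamma(n + 1) - math.lgamma(h + 1)
                    - math.lgamma(n - h + 1))
        # The convention 0^0 = 1 means the boundary factors contribute nothing.
        if h > 0:
            log_term += h * math.log(h / n)
        if h < n:
            log_term += (n - h) * math.log((n - h) / n)
        total += math.exp(log_term)
    return total


def multinomial_normalizer(K, n):
    """C(K, n) for K >= 1 categories and sample size n (assumed recurrence)."""
    if K < 1:
        raise ValueError("K must be at least 1")
    prev, curr = 1.0, binomial_normalizer(n)  # C(1, n) and C(2, n)
    if K == 1:
        return prev
    for k in range(1, K - 1):
        # C(k+2, n) = C(k+1, n) + (n / k) * C(k, n)
        prev, curr = curr, curr + (n / k) * prev
    return curr
```

Calling `multinomial_normalizer(K, n)` for increasing `n` gives a feel for how the term grows with the data size. The sketch makes no claim about the precision a given application requires.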
“…The computational complexity of computing the NML criterion for a Naive Bayes model is the same as for this algorithm, as the numerator of (1) is trivial to compute. Further information on computing the stochastic complexity for Naive Bayes models can be found in papers [11,12].…”
Section: The Algorithm; citation type: mentioning
confidence: 99%