2001
DOI: 10.1109/18.930925

Adaptive context trees and text clustering

Abstract: In the finite-alphabet context we propose four alternatives to fixed-order Markov models to estimate a conditional distribution. They consist in working with a large class of variable-length Markov models represented by context trees, and building an estimator of the conditional distribution with a risk of the same order as the risk of the best estimator for every model simultaneously, in a conditional Kullback-Leibler sense. Such estimators can be used to model complex objects like texts written in na…
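
The variable-length Markov models mentioned in the abstract can be made concrete with a small count-based sketch. The snippet below is a hypothetical illustration, not the paper's estimator: it stores next-symbol counts per context in a dictionary-backed context tree and, for prediction, falls back to the longest context seen in training, with add-one smoothing. The function names, the max_depth parameter, and the toy text are assumptions made for the example.

```python
# Minimal sketch of a variable-length Markov model represented by a context
# tree of counts. Illustrative only; this is not the estimator proposed in
# the paper, and max_depth / add-one smoothing are assumptions.
from collections import defaultdict

def build_context_counts(sequence, max_depth=3):
    """Count next-symbol occurrences for every context (suffix) up to max_depth."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, symbol in enumerate(sequence):
        for d in range(0, max_depth + 1):
            if d > i:
                break
            context = tuple(sequence[i - d:i])  # the d symbols preceding position i
            counts[context][symbol] += 1
    return counts

def conditional_distribution(counts, history, alphabet, max_depth=3):
    """Estimate P(next symbol | history) from the longest context observed in training."""
    for d in range(min(max_depth, len(history)), -1, -1):
        context = tuple(history[len(history) - d:])
        if context in counts:
            ctx_counts = counts[context]
            total = sum(ctx_counts.values()) + len(alphabet)  # add-one smoothing
            return {a: (ctx_counts.get(a, 0) + 1) / total for a in alphabet}
    # Fall back to the uniform distribution if even the empty context is unseen.
    return {a: 1.0 / len(alphabet) for a in alphabet}

# Usage: estimate P(next character | "ab") from a toy finite-alphabet text.
text = list("abracadabra")
alphabet = sorted(set(text))
counts = build_context_counts(text, max_depth=3)
print(conditional_distribution(counts, list("ab"), alphabet, max_depth=3))
```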

Citations: cited by 11 publications (5 citation statements)
References: 21 publications (35 reference statements)
“…The communities are differentiated in a strong sense and in a weak sense in [26]. It is worth noting that many clustering algorithms can be used to discover communities under different conditions if we regard community discovery as a kind of clustering process [32], [34].…”
Section: On Community Discovery and Relation Query (mentioning, confidence: 99%)
“…The most suitable idea for our setting is the Kullback-Leibler (KL) divergence (see Equation (2)), which gives a notion of the distance between two distributions. The KL divergence is commonly used in several fields to measure the distance between two probability density functions (PDFs, as in, e.g., information theory [35], pattern recognition [31,17]).…”
Section: Comparing Two Distributions (mentioning, confidence: 99%)
“…The most suitable idea for our setting is the KL (Kullback-Leibler) divergence (see Equation (3.1)), which gives a notion of distance between two distributions. The KL divergence is commonly used in several fields to measure the distance between two PDFs (probability density functions, as in, e.g., information theory [22], pattern recognition [19,10]).…”
Section: Related Work (mentioning, confidence: 99%)
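
Both statements above rely on the discrete Kullback-Leibler divergence to compare distributions. The sketch below is a hypothetical illustration, not code from the cited papers: the smoothing constant and the toy word histograms are assumptions, and since KL is asymmetric, clustering applications often work with the symmetrized sum shown in the usage line.

```python
# Minimal sketch of the discrete Kullback-Leibler divergence between two
# probability distributions given as dictionaries. Illustrative only; the
# eps smoothing and the example histograms are assumptions.
import math

def kl_divergence(p, q, eps=1e-12):
    """D(p || q) = sum_x p(x) * log(p(x) / q(x)), with eps guarding zero masses in q."""
    support = set(p) | set(q)
    return sum(
        p.get(x, 0.0) * math.log((p.get(x, 0.0) + eps) / (q.get(x, 0.0) + eps))
        for x in support
        if p.get(x, 0.0) > 0.0
    )

def normalize(counts):
    """Turn raw counts into a probability distribution."""
    total = sum(counts.values())
    return {x: c / total for x, c in counts.items()}

# Usage: compare two toy word-count histograms; the symmetrized sum
# D(p||q) + D(q||p) is a common clustering "distance".
p = normalize({"graph": 4, "community": 3, "cluster": 3})
q = normalize({"graph": 2, "community": 1, "text": 7})
print(kl_divergence(p, q), kl_divergence(p, q) + kl_divergence(q, p))
```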