2009 International Multiconference on Computer Science and Information Technology
DOI: 10.1109/imcsit.2009.5352714

Real-time unsupervised classification of web documents

Abstract: This paper addresses the problem of clustering dynamic collections of web documents. We present an iterative algorithm based on fine-grained keyword extraction (simple words, compound words, and proper nouns). Each new document inserted into the collection is either assigned to an existing class containing documents on the same topic or assigned to a new class. After each step, classes are refined using statistical techniques when necessary. The implementation of this algorithm was successfully in…
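
The assign-or-create step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the small stopword list, the cosine similarity measure, the 0.3 threshold, and the names `keywords`, `cosine`, and `IncrementalClusterer` are all assumptions chosen for illustration. The statistical refinement of classes mentioned in the abstract is not modelled here.

```python
from collections import Counter
import math

# Assumed toy stopword list; the paper's keyword extraction is finer-grained.
STOPWORDS = {"the", "a", "of", "and", "in", "to", "is", "for", "on"}


def keywords(text):
    """Crude keyword extraction: lowercase alphabetic tokens minus stopwords."""
    tokens = (t.strip(".,;:!?()\"'").lower() for t in text.split())
    return Counter(t for t in tokens if t.isalpha() and t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two keyword-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class IncrementalClusterer:
    """Hypothetical one-pass clusterer: each document joins the most
    similar existing class or opens a new one."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed value, not from the paper
        self.centroids = []         # one summed keyword Counter per class
        self.members = []           # document ids per class

    def add(self, doc_id, text):
        vec = keywords(text)
        best, best_sim = None, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= self.threshold:
            # Close enough to an existing class: merge the keyword counts.
            self.centroids[best].update(vec)
            self.members[best].append(doc_id)
            return best
        # No sufficiently similar class: create a new one.
        self.centroids.append(vec)
        self.members.append([doc_id])
        return len(self.centroids) - 1
```

Calling `add` once per incoming document returns the index of the class it was placed in. Documents that share most keywords end up in the same class, while an unrelated document opens a new one.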

Cited by 2 publications
References 4 publications