1963
DOI: 10.1145/321160.321165

Automatic Document Classification

Cited by 145 publications (70 citation statements)
References 2 publications
“…For WordStat, the analysis was restricted to all words occurring 10 times or more. While the recommended minimum loading value for topic extraction using FA is 0.30 according to [20] or 0.20 as used by [5] this latter criterion resulted in many topics containing fewer than 10 words. The minimum loading criterion was thus reduced to 0.01, allowing for the extraction of 10 words for each topic for all three datasets.…”
Section: Methods (mentioning)
confidence: 99%
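
The thresholding step described in the statement above can be sketched as follows. This is a minimal illustration, not the WordStat implementation: the function name `top_words_per_topic`, the toy vocabulary, and the loading values are all hypothetical, and the sketch assumes raw (not absolute) loadings are compared against the minimum.

```python
def top_words_per_topic(loadings, vocab, min_loading=0.01, k=10):
    """For each topic (column of `loadings`), keep words whose loading is
    at least `min_loading`, then return the k highest-loading ones."""
    n_topics = len(loadings[0])
    topics = []
    for j in range(n_topics):
        # Pair each word with its loading on topic j and apply the cutoff.
        scored = [(row[j], word) for row, word in zip(loadings, vocab)
                  if row[j] >= min_loading]
        # Highest loading first; Python's sort is stable, so ties keep vocabulary order.
        scored.sort(key=lambda pair: -pair[0])
        topics.append([word for _, word in scored[:k]])
    return topics


# Hypothetical toy data: 4 words (rows) by 2 topics (columns).
vocab = ["index", "term", "factor", "matrix"]
loadings = [
    [0.45, 0.02],
    [0.25, 0.00],
    [0.05, 0.60],
    [0.00, 0.35],
]

strict = top_words_per_topic(loadings, vocab, min_loading=0.30, k=2)
loose = top_words_per_topic(loadings, vocab, min_loading=0.01, k=2)
```

With the stricter cutoff the first toy topic keeps only one word, while the lower cutoff lets every topic reach the requested `k` words, which mirrors the trade-off the statement describes.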
“…FA was initially aimed to reduce the dimensionality of data to discover the latent content from the data [5,22]. In FA, each word $w_i$ in the vocabulary $V$ containing all words in a corpus, $w_i \in V$, $\forall i \in \{1, \dots, n\}$, can be represented as a linear function of $m$ ($< n$) topics (aka common factors), indexed by $j \in \{1, \dots, m\}$.…”
Section: Factor Analysis (mentioning)
confidence: 99%
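
The linear representation mentioned in the statement above can be illustrated with a small sketch. It assumes the standard factor-analysis form, in which a word's value is a weighted sum of topic (common factor) scores plus a word-specific term. The function name `word_from_topics`, the `unique` term, and the numbers are hypothetical and are not taken from the cited paper.

```python
def word_from_topics(loadings_row, topic_scores, unique=0.0):
    """Compute x_i = sum_j loading_ij * f_j + unique_i for one word i.

    loadings_row: the m loadings of word i on each topic (assumed given).
    topic_scores: the m topic (common factor) scores for one document.
    unique:       word-specific part not explained by the topics (assumption).
    """
    if len(loadings_row) != len(topic_scores):
        raise ValueError("need one loading per topic")
    return sum(l * f for l, f in zip(loadings_row, topic_scores)) + unique


# Hypothetical example with m = 2 topics.
value = word_from_topics([0.5, 0.25], [2.0, 4.0])
value_with_unique = word_from_topics([0.5, 0.25], [2.0, 4.0], unique=0.5)
```

Because there are fewer topics than words (m < n), the n word values are summarized by m topic scores, which is the dimensionality reduction the statement refers to.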
“…TC is used in many application contexts, ranging from automatic document indexing based on a controlled vocabulary (Borko and Bernick 1963; Gray and Harley 1971; Field 1975), to document filtering (Amati and Crestani 1999; Iyer, Lewis et al 2000; Kim, Hahn et al 2000), word sense disambiguation (Gale, Church et al 1992; Escudero, Marquez et al 2000), population of hierarchical catalogues of Web resources (Chakrabarti, Dom et al 1998; Attardi, Gulli et al 1999; Oh, Myaeng et al 2000), and in general any application requiring document organization or selective and adaptive document dispatching.…”
Section: Text Categorization (mentioning)
confidence: 99%
“…Automatic classification of text documents has been one of the biggest challenges in natural language processing for decades [2], [15], [21], [22]. Distinguishing good and bad documents is relevant for various types of real-world situations such as finding useful Web pages or reviewing research papers.…”
Section: Introduction (mentioning)
confidence: 99%