2003
DOI: 10.1007/s00778-003-0098-9

Fast and accurate text classification via multiple linear discriminant projections

Abstract: Support vector machines (SVMs) have shown superb performance for text classification tasks. They are accurate, robust, and quick to apply to test instances. Their only potential drawback is their training time and memory requirement. For n training instances held in memory, the best-known SVM implementations take time proportional to n^a, where a is typically between 1.8 and 2.1. SVMs have been trained on data sets with several thousand instances, but Web directories today contain millions of instan…
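The SVM-based text classification setup the abstract describes can be sketched as follows — a minimal illustration assuming scikit-learn, with toy documents and class names that are not from the paper:

```python
# Minimal sketch of linear-SVM text classification (illustrative data,
# not the paper's corpus). Training cost grows superlinearly in the
# number of documents n, which motivates the paper's faster alternative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

docs = ["the stock market fell", "the team won the match",
        "shares rallied today", "the striker scored twice"]
labels = ["finance", "sports", "finance", "sports"]

vec = TfidfVectorizer()
X = vec.fit_transform(docs)        # sparse TF-IDF feature vectors
clf = LinearSVC().fit(X, labels)   # linear SVM trained in memory

pred = clf.predict(vec.transform(["the market rallied"]))[0]
print(pred)
```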

Cited by 107 publications (32 citation statements)
References 27 publications
“…20NG contains 19,997 documents of 20 topics (categories). As in Chakrabarti et al (2003), we employ 75% of the documents for training, and the remaining 25% for testing. Therefore, there are 14,997 training documents and 5,000 test documents, which are uniformly extracted from the original 19,997 documents.…”
Section: Methods
confidence: 99%
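The 75/25 split quoted above can be checked arithmetically. Since 0.25 × 19,997 = 4,999.25, reproducing the cited 14,997 / 5,000 figures requires rounding the test fraction up — an assumption about the citing authors' rounding, not something stated in the quote:

```python
# Arithmetic check of the quoted 20NG split: 19,997 documents, 25% held
# out for testing. Rounding the test count up (an assumption) yields
# exactly the cited 14,997 training / 5,000 test documents.
import math

n_docs = 19_997
n_test = math.ceil(0.25 * n_docs)   # 5,000
n_train = n_docs - n_test           # 14,997
print(n_train, n_test)              # 14997 5000
```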
“…SVM is a popular technique in TC (e.g., Bennett & Nguyen, 2009; Xue et al, 2008; Qi & Davison, 2008; Chakrabarti et al, 2003; Yang & Lin, 1999). Previous studies often found that SVM outperforms many classifiers.…”
Section: Methods
confidence: 99%
“…Texts and documents, especially with weighted feature extraction, generate a huge number of features. Many researchers have applied random projection to text data [83,84] for text mining, text classification, and dimensionality reduction. In this section, we review some basic random projection techniques.…”
Section: Random Projection
confidence: 99%
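The basic random projection technique the statement above refers to can be sketched with a Gaussian projection matrix — a minimal NumPy illustration with made-up dimensions, not the cited papers' exact construction:

```python
# Minimal Gaussian random projection for high-dimensional text features.
# By the Johnson-Lindenstrauss lemma, pairwise distances are roughly
# preserved with high probability when k is large enough.
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_features, k = 100, 10_000, 300      # k << n_features

X = rng.random((n_docs, n_features))           # stand-in for TF-IDF vectors
R = rng.normal(0.0, 1.0, (n_features, k)) / np.sqrt(k)  # random projection
X_low = X @ R                                  # reduced (100, 300) matrix

print(X_low.shape)
```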
“…With increasing scientific papers, Internet information, and other text-format data, automatic text categorization plays an important role in information retrieval, data mining and machine learning [13]. Commonly used classification methods are back propagation neural network, decision trees, K-nearest neighbor (KNN), naive Bayes and SVM, and especially SVM achieves good performance on the effectiveness and stability of classification [14][15][16][17][18][19][20][21][22][23][24][25][26][27][28]. However, most of them are supervised learning algorithms and training data or labeled samples often demand great human efforts in practical applications.…”
Section: Introduction
confidence: 99%