2010
DOI: 10.1142/s0218213010000285

A Naïve Bayes Classifier for Web Document Summaries Created by Using Word Similarity and Significant Factors

Abstract: Text classification categorizes web documents in large collections into predefined classes based on their contents. Unfortunately, the classification process can be time-consuming, and users are still required to spend a considerable amount of time scanning through the classified web documents to identify the ones with contents that satisfy their information needs. In solving this problem, we first introduce CorSum, an extractive single-document summarization approach, which is simple and effective in performing …

Cited by 12 publications (2 citation statements)
References 22 publications
“…There are various methods for this sentence selection. These include splitting sentences into ‘summary sentence’ or ‘non-summary sentence’ by machine learning methods such as Naïve-Bayes [6], decision trees [7], hidden Markov model (HMM) [8], support vector machines (SVM) [9] and reinforcement learning [10].…”
Section: Related Work
mentioning
confidence: 99%
“…Methods such as longest common subsequence, n-grams and fingerprint are considered as this kind of methods. The comparison units adopted include words, sentences, human defined sliding window or an n-gram [6][7][8][9][10][11][12]. The Syntactical methods use text's syntactical units for comparing the similarity between documents.…”
mentioning
confidence: 99%