2022 7th International Conference on Multimedia and Image Processing 2022
DOI: 10.1145/3517077.3517104

An improved algorithm of TFIDF combined with Naive Bayes

Cited by 6 publications
(2 citation statements)
References 4 publications
“…Data mining is the process of extracting potentially useful information and knowledge from a large amount of incomplete, noisy, fuzzy, and random data. Data mining algorithms are broadly categorized into classification algorithms: the C4.5 algorithm is based on information theory and uses information entropy and information gain degree as the measure to achieve inductive classification of the data [21]; the Plain Bayesian algorithm is based on Bayes' theorem with the assumption of conditional independence of features as the classification method [22]; Support Vector Machines (SVMs) [23] map the points of the low-dimensional space into the high-dimensional space so that they become linearly divided and then use the principle of linear division to determine classification boundaries; such an approach also includes the K Nearest Neighbor classification algorithm (KNN) and Adaboost. The K-Means algorithm, when given a set of samples, divides the sample set into K clusters according to the size of the distance between the samples; each object is assigned to the closest clustering center [24]; The EM maximum expectation algorithm is an algorithm used for finding the maximum likelihood estimates of parameters in a probabilistic model, where the probabilistic model relies on unobservable hidden variables.…”
Section: Data Mining Algorithm
mentioning
confidence: 99%
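
The statement above says C4.5 uses information entropy and information gain as its splitting measure. A minimal sketch of those two quantities follows, written for illustration only: the function names and the toy weather rows are assumptions, not taken from the citing paper, and full C4.5 additionally normalizes the gain into a gain ratio, which is omitted here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, feature_index):
    """Entropy reduction obtained by splitting on one categorical feature."""
    total = len(labels)
    # Group the class labels by the value each row takes on the chosen feature.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature_index], []).append(label)
    # Weighted average entropy of the partitions after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


# Hypothetical toy data: (outlook, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "hot"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
gain_outlook = information_gain(rows, labels, 0)
gain_temperature = information_gain(rows, labels, 1)
```

In this toy set the first feature separates the classes perfectly while the second carries no information, so a tree builder that ranks features by gain would split on the first one.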
“…Yang, Z. et al [22] proposed Hierarchical attention networks (HAN) for document classification, which maintain a hierarchical structure of word to sentence (building sentence from words) and sentence to document (aggregating sentences to a document representation). Zhang, Z. et al [23] proved that the TFIDF algorithm with the combination of Naive Bayes has significance in the text classification task compared to many complex models.…”
Section: Literature Review
mentioning
confidence: 99%
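
For readers unfamiliar with the combination named in this statement, the following is a rough sketch of a plain TF-IDF weighting feeding a multinomial Naive Bayes classifier. It is not the improved algorithm of the cited paper, whose details do not appear in this report. The smoothed IDF formula, the function names, the `alpha` smoothing parameter, and the toy documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def tfidf_vectors(docs):
    """Return per-document TF-IDF weights and the IDF table (smoothed IDF, an assumption)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes where TF-IDF weights stand in for raw term counts."""
    class_docs = Counter(labels)
    weight = defaultdict(lambda: defaultdict(float))
    for vec, y in zip(vectors, labels):
        for t, w in vec.items():
            weight[y][t] += w
    model = {}
    for y in class_docs:
        # Laplace-style smoothing so unseen terms keep a nonzero likelihood.
        total = sum(weight[y].values()) + alpha * len(vocab)
        model[y] = {
            "prior": math.log(class_docs[y] / len(labels)),
            "loglik": {t: math.log((weight[y][t] + alpha) / total) for t in vocab},
        }
    return model


def predict(model, idf, doc):
    """Pick the class with the highest log posterior; unknown terms are ignored."""
    tf = Counter(t for t in doc if t in idf)
    n = sum(tf.values()) or 1
    scores = {
        y: p["prior"] + sum((c / n) * idf[t] * p["loglik"][t] for t, c in tf.items())
        for y, p in model.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy corpus, already tokenized.
docs = [
    ["cheap", "pills", "buy"],
    ["buy", "cheap", "now"],
    ["meeting", "agenda", "notes"],
    ["project", "meeting", "today"],
]
labels = ["spam", "spam", "ham", "ham"]
vectors, idf = tfidf_vectors(docs)
model = train_nb(vectors, labels, list(idf))
```

Calling `predict(model, idf, ["cheap", "buy"])` scores each class by its log prior plus the TF-IDF-weighted log likelihoods of the query terms; the conditional independence assumption is what lets those per-term contributions be summed.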