2018 6th International Symposium on Digital Forensic and Security (ISDFS) 2018
DOI: 10.1109/isdfs.2018.8355325

A comparative approach for multiclass text analysis

Cited by 5 publications (5 citation statements)
References 7 publications
“…Beautiful Soup is a Python library for pulling data out of HTML and XML files[[1] (paragraph 1)]. Before using BeautifulSoup methods we need [14] is as follows: gensim.corpora is one of the packages that supports implementation of various streaming corpus I/O formats. Dictionary is one of the classes that belongs to this package.…”
Section: Results (mentioning)
confidence: 99%
“…Franko and Burak's study [14] aimed to show how well popular machine learning techniques classify Spanish documents found in digital resources. They selected machine learning classifiers, namely Naive Bayes and Maximum entropy methods, performed a comparative analysis with document models CountVectorizer, TF-IDF, and Hashing vectorizer models.…”
Section: Contributions Of Our Work (mentioning)
confidence: 99%
“…They selected machine learning classifiers, namely Naive Bayes and Maximum entropy methods, performed a comparative analysis with document models CountVectorizer, TF-IDF, and Hashing vectorizer models. According to Franko and Burak, maximum entropy produces more accurate results with an accuracy value of 0.75 when applied with the HashVectorizer model classifier [ [14] (page5, paragraph 1)]. However, the document category of biografia, which had only 4.2% of the instances in the test set, shows a low f1-score of 0.17.…”
Section: Contributions Of Our Work (mentioning)
confidence: 99%
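
The three document models named in the statements above (count, TF-IDF, and hashing vectors) can be illustrated with a minimal standard-library sketch. This is not the cited paper's code and not the scikit-learn implementation: the helper names, the toy Spanish sentences, the bucket count of 8, and the smoothed IDF formula are all assumptions chosen for illustration.

```python
import hashlib
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; real pipelines do more cleaning).
    return text.lower().split()


def count_vector(tokens, vocabulary):
    # Count model: one dimension per vocabulary term, value = raw term frequency.
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def tfidf_vector(tokens, vocabulary, corpus_tokens):
    # TF-IDF model: term frequency scaled by a smoothed inverse document
    # frequency, idf = ln((1 + N) / (1 + df)) + 1 (one common variant, assumed here).
    n_docs = len(corpus_tokens)
    counts = Counter(tokens)
    vector = []
    for term in vocabulary:
        doc_freq = sum(1 for doc in corpus_tokens if term in doc)
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1
        vector.append(counts[term] * idf)
    return vector


def hashing_vector(tokens, n_features=8):
    # Hashing model: no vocabulary is stored; each token is hashed into one of
    # n_features buckets, so distinct tokens may collide in the same bucket.
    vector = [0] * n_features
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % n_features] += 1
    return vector


# Toy corpus (hypothetical, not from the cited data set).
corpus = ["el gato come pescado", "el perro come carne", "biografia de un poeta"]
corpus_tokens = [tokenize(doc) for doc in corpus]
vocabulary = sorted({term for doc in corpus_tokens for term in doc})

count_vec = count_vector(corpus_tokens[0], vocabulary)
tfidf_vec = tfidf_vector(corpus_tokens[0], vocabulary, corpus_tokens)
hash_vec = hashing_vector(corpus_tokens[0])
```

In this sketch a term that appears in fewer documents (such as "gato") receives a larger TF-IDF weight than one shared across documents (such as "el"), while the hashing vector keeps a fixed length regardless of vocabulary size. Any of the three vectors could then be passed to a classifier such as Naive Bayes or maximum entropy.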
“…In reference [3] Semuel Franko and Ismail Burak Parlak have presented multiclass text analysis for the classification problem in Spanish documents. Even if Spanish language is considered as one the most spoken language, classification of text is not carried out due to certain issues in multiclass classification.…”
Section: Related Work (mentioning)
confidence: 99%