2022
DOI: 10.1155/2022/1883698

A Complete Process of Text Classification System Using State-of-the-Art NLP Models

Abstract: With the rapid advancement of information technology, online information has been exponentially growing day by day, especially in the form of text documents such as news events, company reports, reviews on products, stocks-related reports, medical reports, tweets, and so on. Due to this, online monitoring and text mining has become a prominent task. During the past decade, significant efforts have been made on mining text documents using machine and deep learning models such as supervised, semisupervised, and …

Cited by 62 publications (21 citation statements)
References 134 publications

“…Researchers have demonstrated the adaptability of Word2Vec and BERT in the field of biomedical domain to develop models such as BioWordVec [33] and BioBERT [34], as well as other domain-specific models such as SciBERT [35] trained on various scientific and biomedical corpuses, ClinicalBERT [36] trained on clinical notes for various NLP tasks, and MatSciBERT [37] trained on material science publications. Deep learning models that take such trained word representations as input have been employed by researchers to classify unstructured texts documents [38], medical notes [39], health-related social media texts [40], and biomedical text mining tasks [41]. Besides these, handwritten script recognition [42], detection of diseases [43][44][45], and healthcare solutions [46] involve the potential application of deep learning models.…”
Section: Related Work
mentioning
confidence: 99%
“…The model only required five times training using the loss function binary cross-entropy (BCE) [27] using formula (4).…”
Section: A Training
mentioning
confidence: 99%
“…Six labels in a text that correspond to the basic emotions are used in the classification process [3]. Various techniques, including support vector machine (SVM), naïve Bayes, random forest, convolutional neural networks, have been used in numerous prior research [4]-[7]. The cross-lingual language model-robustly optimized bidirectional encoder representations from transformers approach (XLM-RoBERTa) model could improve the classification performance of a hate speech text in Indonesian to 89.52%, compared with the previous research using long short term memory (LSTM), which only reached 77.36% optimization [1].…”
mentioning
confidence: 99%
“…The average achieved accuracy was 83.21%, the average F-positive rate was 10.03%, and the average F-measure was 86%. Similarly, Dogra et al. [48] also considered the pattern of the URLs posting in Twitter social media platforms by analyzing the behavior of the URLs posting users and URLs clicking users. Using Twitter APIs combined with Bitly APIs they collected around 7 million tweets that contain shortened URLs created by Bitly and tried different sets of features including Average clicks, Posting count, Median followers, Median friends, Score function Score Category.…”
Section: Literature Review
mentioning
confidence: 99%