2020 International Conference on INnovations in Intelligent SysTems and Applications (INISTA) 2020
DOI: 10.1109/inista49547.2020.9194669

Tuning the Turkish Text Classification Process Using Supervised Machine Learning-based Algorithms

Cited by 11 publications (13 citation statements)
References 14 publications
“…Also, tuning BOW size is one of the most effective methods to improve classification accuracy [9]. To investigate the effect of BOW size, we perform detailed tests using BOW size starting from 100, 500, and 1000 to 500,000 incrementing 1000 applying all nine classifiers on the dataset. Figures 5 and 6 show the F1-score variation for the different BOW sizes for the TTC-3600 and TTC-4900 datasets.…”
Section: BOW Size
mentioning
confidence: 99%
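The sweep described in the statement above (vary the BOW size, retrain, record the F1-score) can be sketched in a few lines. This is a minimal illustration only, not the cited authors' pipeline: it assumes BOW size means "keep the N most frequent training tokens", uses a hand-written multinomial Naive Bayes as a stand-in for one of the nine classifiers, and runs on a made-up toy corpus. All function names and data below are hypothetical.

```python
import math
from collections import Counter


def build_vocab(docs, bow_size):
    """Keep the bow_size most frequent tokens in the training documents."""
    counts = Counter(tok for doc in docs for tok in doc.split())
    # Sort by descending frequency, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {tok for tok, _ in ranked[:bow_size]}


def train_nb(docs, labels, vocab):
    """Collect per-class term counts and log priors for multinomial Naive Bayes."""
    classes = sorted(set(labels))
    term_counts = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        term_counts[label].update(t for t in doc.split() if t in vocab)
    priors = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    return classes, priors, term_counts


def predict_nb(model, doc, vocab):
    """Return the class with the highest add-one-smoothed log score."""
    classes, priors, term_counts = model
    best, best_score = None, -math.inf
    for c in classes:
        total = sum(term_counts[c].values()) + len(vocab)
        score = priors[c]
        for t in doc.split():
            if t in vocab:
                score += math.log((term_counts[c][t] + 1) / total)
        if score > best_score:
            best, best_score = c, score
    return best


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        scores.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(scores) / len(scores)


def sweep_bow_sizes(train, test, sizes):
    """Retrain and evaluate once per BOW size; return {size: macro F1}."""
    (train_docs, train_labels), (test_docs, test_labels) = train, test
    results = {}
    for size in sizes:
        vocab = build_vocab(train_docs, size)
        model = train_nb(train_docs, train_labels, vocab)
        preds = [predict_nb(model, d, vocab) for d in test_docs]
        results[size] = macro_f1(test_labels, preds)
    return results


# Toy data (hypothetical); a real run would load TTC-3600 or TTC-4900 instead.
TRAIN = (["goal match team goal", "team match score",
          "market bank rate", "bank rate market rate"],
         ["sport", "sport", "economy", "economy"])
TEST = (["team goal", "bank market"], ["sport", "economy"])

results = sweep_bow_sizes(TRAIN, TEST, sizes=[1, 3, 100])
for size, f1 in results.items():
    print(f"BOW size {size}: macro F1 = {f1:.3f}")
```

On real data the size grid would follow the range quoted above, and plotting `results` would give a curve comparable in shape to the F1-score figures the statement mentions.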
“…This article is the extended version of our previous study [9]. In this study, we classify two public Turkish news datasets used for benchmarking in this domain, namely TTC-3600 [10] and TTC-4900 [11] datasets. These datasets include Turkish news data and are widely used as benchmarking datasets in several studies.…”
mentioning
confidence: 99%
“…Although some researchers apply different parsing techniques for multilingual data [22], we use two-step lemmatization and two-step removal of the stop words, as shown in Figure 4. Details of this process are given in our previous study [23]. Since we are dealing with two different languages (Turkish, English), we have two different steps to eliminate stop words and two different steps for lemmatizing.…”
Section: Effect Of Preprocessing On Bug Classification
mentioning
confidence: 99%
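The two-step preprocessing mentioned in the statement above can be sketched as follows. This is an assumption-laden illustration, not the cited authors' implementation: the order of the steps (both stop-word passes first, then both lemmatization passes) is a guess because the referenced Figure 4 is not available here, and the tiny stop-word sets and lemma lookup tables are hypothetical placeholders for real Turkish and English resources.

```python
# Hypothetical miniature resources; a real pipeline would use full
# stop-word lists and proper Turkish and English lemmatizers.
TURKISH_STOP_WORDS = {"ve", "bir", "bu", "için"}
ENGLISH_STOP_WORDS = {"the", "and", "is", "a"}
TURKISH_LEMMAS = {"hatalar": "hata", "dosyaları": "dosya"}
ENGLISH_LEMMAS = {"crashes": "crash", "files": "file"}


def remove_stop_words(tokens, stop_words):
    """Drop every token that appears in the given stop-word set."""
    return [t for t in tokens if t not in stop_words]


def lemmatize(tokens, lemma_table):
    """Map each token to its lemma; unknown tokens pass through unchanged."""
    return [lemma_table.get(t, t) for t in tokens]


def preprocess(text):
    """Apply one stop-word pass and one lemmatization pass per language.

    The input is assumed to be lowercase and whitespace-tokenizable.
    """
    tokens = text.split()
    tokens = remove_stop_words(tokens, TURKISH_STOP_WORDS)  # step 1: Turkish
    tokens = remove_stop_words(tokens, ENGLISH_STOP_WORDS)  # step 2: English
    tokens = lemmatize(tokens, TURKISH_LEMMAS)              # step 1: Turkish
    tokens = lemmatize(tokens, ENGLISH_LEMMAS)              # step 2: English
    return tokens


print(preprocess("bu dosyaları the app crashes ve hatalar"))
```

Keeping each language in its own pass means a token that is unknown to one language's resources simply falls through to the next pass, which is one plausible reason for splitting the work in a mixed Turkish and English corpus.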
“…[69] Examples of setups and applications are (but not limited to) social media [70], healthcare [71-73], information retrieval [74], sentiment analysis [75-79], content-based recommender systems [80], document summarization [81, 82], various business and marketing applications [83-85], and legal document categorization [86]. A variety of languages were targeted over time for the popular text classification task, including well-studied languages, such as Arabic [87, 88], Turkish [83, 89-91], French [71, 92], Spanish [72], and Indian [93], as well as underresourced languages, such as Romanian [94]. The applied classification techniques range from shallow methods, such as Logistic Regression [95], SVM [96], and Naïve Bayes [97], to more complex and resource-hungry deep neural networks, such as CNNs [62, 98], Hierarchical Attention Networks (HANs) [99], and the powerful transformer-based methods that started to dominate the landscape in recent years.…”
Section: Text Classification
mentioning
confidence: 99%