2018
DOI: 10.1049/iet-sen.2016.0226

Characterising text mining: a systematic mapping review of the Portuguese language

Abstract: Documents written in natural language constitute a major part of the artefacts produced during the software engineering life cycle. Studies indicate that more than 80% of enterprise data is stored in some sort of unstructured form, mainly as text. Therefore, the growth of user-generated content, especially from social media, provides a huge amount of data which allows discovering the experiences, opinions, and feelings of users. Text mining refers to the set of tools, techniques, and algorithms adopted to extr…

Cited by 11 publications (9 citation statements)
References 20 publications
“…Despite significant advances, there are challenges; for instance, due to informal language, idioms, and culturally specific terms, there are few comprehensive linguistic models for different domains and geographic areas [Khurana et al 2023, Pedroso et al 2022]. [Souza et al 2018] conducted a systematic mapping of studies related to the application of text mining to the Portuguese language from 1996 to 2014. The study used an automated search approach in digital libraries and a manual search in several conference proceedings held in Brazil (e.g., PROPOR, BraSNAM, and STIL).…”
Section: Related Work (mentioning)
confidence: 99%
“…The limitations of algorithms and tools for these languages are an important obstacle in this scenario. Likewise, few studies have focused on Brazilian events, as in the case of [Souza et al 2018], whose mapping covered only up to 2014, and [Júnior et al 2020] which was based on studies of international conferences. Therefore, the need for a systematic mapping directed to the NLP in social media analysis comes from the lack of works that show the state of the art focused on Brazilian academic events, in order to fill this gap and provide a comprehensive view of the state of the art in the national context.…”
Section: Related Work (mentioning)
confidence: 99%
“…Previously, three machine learning (ML) algorithms were evaluated for building the sentiment classifier: Multinomial Naïve Bayes (MNB), Support Vector Machine (SVM), and Random Forest (RF). These were chosen because they are the three most widely used algorithms for text classification in Portuguese [Souza et al 2018]. In addition, six combinations of preprocessing techniques were also evaluated: 1) unigram; 2) bigram; 3) unigram + bigram; 4) unigram + stopword removal; 5) bigram + stopword removal; 6) unigram + bigram + stopword removal.…”
Section: Evaluation of the Classifiers (unclassified)
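The six preprocessing combinations listed in the statement above can be illustrated with a minimal sketch. It is not the citing authors' code: the stopword list, the whitespace tokenizer, the function names, and the choice to remove stopwords before forming bigrams are all assumptions made for illustration.

```python
# Hypothetical, deliberately tiny Portuguese stopword list (illustration only).
STOPWORDS = {"a", "o", "de", "do", "da", "e", "que", "um", "uma", "eu"}


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace (a simplifying assumption)."""
    return text.lower().split()


def ngrams(tokens: list[str], n: int) -> list[str]:
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract(text: str, sizes: tuple[int, ...], remove_stopwords: bool) -> list[str]:
    """Build the feature list for one preprocessing configuration."""
    tokens = tokenize(text)
    if remove_stopwords:
        # Assumption: stopwords are dropped before n-grams are formed.
        tokens = [t for t in tokens if t not in STOPWORDS]
    features: list[str] = []
    for n in sizes:
        features.extend(ngrams(tokens, n))
    return features


# The six combinations named in the statement: (n-gram sizes, remove stopwords?).
CONFIGS = {
    "unigram": ((1,), False),
    "bigram": ((2,), False),
    "unigram+bigram": ((1, 2), False),
    "unigram+stopwords": ((1,), True),
    "bigram+stopwords": ((2,), True),
    "unigram+bigram+stopwords": ((1, 2), True),
}


def features_for_all_configs(text: str) -> dict[str, list[str]]:
    """Map each configuration name to the features it yields for `text`."""
    return {name: extract(text, sizes, rm) for name, (sizes, rm) in CONFIGS.items()}
```

Calling `features_for_all_configs("eu gostei muito do filme")` returns one feature list per configuration. Each list could then be vectorized and passed to any of the three classifiers named in the statement.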
“…We have strictly followed the guidelines proposed by Kitchenham and Charters [23], Kitchenham et al [22], Petersen et al [24], and Petersen et al [25] to achieve an impartial review. These guidelines have been widely adopted in SLR, surveys, and SMS [41-44] …”
Section: Planning and Conducting The Mapping (mentioning)
confidence: 99%