2020
DOI: 10.3390/app10165505

A Novel Statistic-Based Corpus Machine Processing Approach to Refine a Big Textual Data: An ESP Case of COVID-19 News Reports

Abstract: With developments of modern and advanced information and communication technologies (ICTs), Industry 4.0 has launched big data analysis, natural language processing (NLP), and artificial intelligence (AI). Corpus analysis is also a part of big data analysis. For many cases of statistic-based corpus techniques adopted to analyze English for specific purposes (ESP), researchers extracted critical information by retrieving domain-oriented lexical units. However, even if corpus software embraces algorithms such as…

Cited by 20 publications (36 citation statements)
References 48 publications
“…When competitor methods (i.e., the traditional frequency-based approach 30 and the refined traditional frequency-based approach 46 ) in handling word-ranking issues only based on words' frequency values or range values, respectively, to determine their sequences, namely, traditional methods do not integrally take a word's dispersion and concentration criteria into account. This deficiency will cause critical word-ranking results exist bias, in addition, the importance levels of high-frequency critical words will be challenged.…”
Section: Comparison and Discussion (mentioning)
confidence: 99%
“…The traditional frequency-based approach 30 The refined traditional frequency-based approach 44 The proposed approach According to Table 8, there were significant differences in token ranking between the traditional corpus-based computing approaches 30,46 and the proposed approach. The traditional corpus-based computing approaches 30,46 only calculated a token's total frequency values to define its rank and importance; however, the frequency dispersion criteria were not taken into consideration; that is, a token with high frequency may not be widely adopted or used by the RA authors, or may be concentrated in very few RAs or even possibly occur in only one RA. Nevertheless, the proposed approach not only used H-index to compute the dispersion and concentration criteria of frequency simultaneously, but also used frequency values to distinguish tokens that had the same H-index values.…”
Section: Refined Data (mentioning)
confidence: 99%
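
The ranking idea in the citation statement above (H-index first, total frequency as a tie-breaker) can be sketched as follows. This is a minimal illustration, not the cited authors' implementation. It assumes a token's H-index is the largest h such that the token occurs at least h times in each of at least h documents. The function names `token_h_index` and `rank_tokens` and the sample documents are hypothetical.

```python
from collections import Counter


def token_h_index(per_doc_counts):
    """Largest h such that the token occurs >= h times in >= h documents.

    Assumed definition, adapted from the bibliometric H-index.
    """
    h = 0
    for rank, count in enumerate(sorted(per_doc_counts, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def rank_tokens(docs):
    """Rank tokens by H-index, then by total frequency.

    docs is a list of token lists, one list per document.
    Returns (token, h_index, total_frequency) tuples, best first.
    """
    per_doc = {}
    for doc in docs:
        for token, count in Counter(doc).items():
            per_doc.setdefault(token, []).append(count)
    rows = [(tok, token_h_index(counts), sum(counts))
            for tok, counts in per_doc.items()]
    # Higher H-index first; total frequency breaks ties; token name keeps
    # the order deterministic when both values are equal.
    return sorted(rows, key=lambda r: (-r[1], -r[2], r[0]))


# Hypothetical toy corpus: "lockdown" is frequent but confined to one document.
docs = [
    ["virus", "virus", "virus", "mask"],
    ["virus", "virus", "mask"],
    ["virus", "virus", "virus"],
    ["lockdown"] * 9,
]
for row in rank_tokens(docs):
    print(row)
```

In this toy corpus, "lockdown" has the highest raw frequency (9) but appears in only one document, so it ranks below "virus", which is spread over three documents. A purely frequency-based ranking would put "lockdown" first.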
“…The studies with a focus on applied linguistics perspective were carried out to discover and explain new coinage related to Covid-19 situation, its influence on other languages and problems arising in the translation and coordination of terminology [9][10][11], generate taxonomies of terms with the help of corpus analysis and estimate word frequencies [12][13][14], collect and systematize massive Covid-19 related text data [15]. The findings shed light on the specificity of scientific and medical language which is significant in specialist and everyday discourse.…”
Section: Introduction (mentioning)
confidence: 99%
“…According to Haddad & Monterero-Martinez [9], a tremendous vocabulary expansion should be attributed to solving communication needs in specialist and everyday communication by filling lexical gaps. The data from various studies [9][10][11][12][13][14][15] indicated that most productive types of vocabulary development included metaphoric and metonymic transfers. Affixation, compounding, abbreviation, clipping and conversion prevailed over other word-building processes.…”
Section: Introduction (mentioning)
confidence: 99%