2021
DOI: 10.1109/access.2021.3118093

Inferring Multilingual Domain-Specific Word Embeddings From Large Document Corpora

Acknowledgments: The research leading to these results has been partly funded by the SmartData@PoliTO center for Big Data and Machine Learning technologies. Computational resources were provided by HPC@POLITO, a project of Academic Computing within the Department of Control and Computer Engineering at the Politecnico di Torino (http://www.hpc.polito.it).

Cited by 5 publications (3 citation statements); references 34 publications.
“…Various methods for sentence selection in both open and closed domains are discussed in Table 1, taking into account the number of categories considered and the objective of constructing the benchmark dataset. [29] (open domain): builds three benchmarks by considering rating numbers in AMAZON, extracting sentence pairs from Japanese datasets, and randomly picking from Wikipedia, yielding a benchmark in the Japanese language that accounts for cultural/social parameters, without using translation methods, in order to evaluate NLU ability in the general domain. [66] (closed domain): in three domains (medicine, technology, and finance), they pick the sentences in Wikipedia based on the defined categories…”
Section: State of the Art (mentioning)
confidence: 99%
“…• Recall: recall assesses the model's capacity to identify all relevant instances and is calculated as the ratio of true positives to the sum of true positives and false negatives. • F1-score: the F1-score, a measure that balances precision and recall, calculated as the harmonic mean of the two values, is used as the evaluation metric in much research related to benchmark performance comparison [66,68,78]…”
(mentioning)
confidence: 99%
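The recall and F1-score definitions quoted in the excerpt above can be sketched directly from their stated formulas. This is a minimal illustration in terms of true-positive, false-positive, and false-negative counts; the function names and example counts are illustrative, not taken from the cited papers.

```python
def recall(tp: int, fn: int) -> float:
    # Recall = TP / (TP + FN): share of relevant instances the model identifies.
    return tp / (tp + fn)

def precision(tp: int, fp: int) -> float:
    # Precision = TP / (TP + FP): share of predicted positives that are correct.
    return tp / (tp + fp)

def f1_score(tp: int, fp: int, fn: int) -> float:
    # F1 = harmonic mean of precision and recall, balancing the two.
    p = precision(tp, fp)
    r = recall(tp, fn)
    return 2 * p * r / (p + r)

# Hypothetical counts: 8 true positives, 2 false positives, 2 false negatives.
print(recall(8, 2))       # 0.8
print(f1_score(8, 2, 2))  # 0.8
```

Because precision and recall are equal in this example (both 0.8), their harmonic mean coincides with them; in general, F1 is pulled toward the lower of the two values.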
“…Nonetheless, researchers who tackle language-oriented tasks (e.g., Natural Language Processing) have also started to explore, adapt and even propose multilingual methods or offer multilanguage support, e.g., [Cagliero and Quatra 2021, Guarasci et al 2022, Krótkiewicz et al 2016, Pessutto et al 2020]. Regarding the Portuguese language, Morais et al [2020] classify a set of current news from Brazilian news portals into fake, satirical, objective and legitimate news.…”
Section: Related Work (mentioning)
confidence: 99%