2019
DOI: 10.2196/12310

Word Embedding for the French Natural Language in Health Care: Comparative Study

Abstract: Background: Word embedding technologies, a set of language modeling and feature learning techniques in natural language processing (NLP), are now used in a wide range of applications. However, no formal evaluation and comparison have been made on the ability of each of the 3 current most famous unsupervised implementations (Word2Vec, GloVe, and FastText) to keep track of the semantic similarities existing between words, when trained on the same dataset. Objective: …

Citations: cited by 21 publications (28 citation statements)
References: 20 publications (20 reference statements)
“…Dynomant et al [33] learned a variety of word embedding models on a corpus of French-language text documents originating from the Rouen University Hospital, in order to compare embedding methods. They used five distinct evaluation tasks revolving around the intrinsic properties of the embeddings.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…We have already developed a vectorial space trained on EDSaN and generated a hybrid semantic annotator [4,5], and document embeddings to create inter-scientific paper similarities in PubMed [6]. Moreover, our MeSH-gram neural network model extends word embedding vectors with MeSH concepts and improves semantic similarity and relatedness [7].…”
Section: Preliminary Results
Citation type: mentioning
confidence: 99%
“…First, the translation process inevitably causes some data loss, even across Indo-European languages. In a French study, the UMNSRS word pair sets were translated into French, and only 73% of the similarity set and 71% of the relatedness set were translated and used [21]. In a comparable Spanish study, only 65% of the relatedness set and 67% of the similarity set were automatically rendered into Spanish because of regional differences in medical protocols and commercial drug names [22].…”
Section: Discussion
Citation type: mentioning
confidence: 99%
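
The coverage figures in the last statement (for example, 73% of a similarity set retained) describe how much of a rated word-pair benchmark is still usable after some filtering step. The Python sketch below illustrates that idea under loud assumptions: the vectors, the French word pairs, and the ratings are invented toy values, and the filter used here (both words must have a vector) is a stand-in for illustration, not the translation procedure of the cited studies. The names `cosine` and `score_pairs` are hypothetical helpers, not part of any library.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def score_pairs(embeddings, rated_pairs):
    """Score every rated pair whose two words both have a vector.

    Returns (coverage, scored), where coverage is the fraction of pairs
    that could be scored and scored is a list of
    (human_rating, cosine_similarity) tuples.
    """
    scored = []
    for word_a, word_b, rating in rated_pairs:
        if word_a in embeddings and word_b in embeddings:
            similarity = cosine(embeddings[word_a], embeddings[word_b])
            scored.append((rating, similarity))
    coverage = len(scored) / len(rated_pairs)
    return coverage, scored


# Toy 3-dimensional vectors (assumption: real models use far more dimensions).
embeddings = {
    "fièvre": [0.9, 0.1, 0.0],
    "température": [0.8, 0.2, 0.1],
    "fracture": [0.0, 0.9, 0.4],
}

# Toy benchmark: (word, word, invented human rating).
rated_pairs = [
    ("fièvre", "température", 3.5),
    ("fièvre", "fracture", 1.0),
    ("fièvre", "paracétamol", 2.5),  # dropped: no vector for "paracétamol"
    ("fracture", "plâtre", 3.0),     # dropped: no vector for "plâtre"
]

coverage, scored = score_pairs(embeddings, rated_pairs)
print(f"coverage: {coverage:.0%}")
for rating, similarity in scored:
    print(f"human rating {rating:.1f} -> cosine {similarity:.3f}")
```

In a real evaluation, the retained pairs would typically be compared with the human ratings through a rank correlation; that step is omitted here to keep the sketch dependency-free.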