2023
DOI: 10.1186/s13326-022-00281-5

MedLexSp – a medical lexicon for Spanish medical natural language processing

Abstract: Background: Medical lexicons enable the natural language processing (NLP) of health texts. Lexicons gather terms and concepts from thesauri and ontologies, and linguistic data for part-of-speech (PoS) tagging, lemmatization or natural language generation. To date, there is no such type of resource for Spanish. Construction and content: This article describes a unified medical lexicon for Medical Natural Language Processing in Spanish. MedLexSp inclu…

Cited by 4 publications (1 citation statement)
References 49 publications

“…Finally, the tweet dataset experienced spaCy’s lemmatization to reduce words to their base form, relying on spaCy’s statistical models and parts-of-speech information. Lemmatization is valuable for preserving the sentiment of the original words, which is crucial for accurate sentiment classification, and for maintaining the semantic meaning of words—an essential aspect for identifying and distinguishing topics [ 25 ]. However, stemming was deliberately excluded from the preprocessing pipeline, prioritizing accuracy over computational efficiency.…”
Section: Methods (mentioning)
confidence: 99%
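
The contrast the citing authors draw between lemmatization and stemming can be illustrated with a small sketch. This is not spaCy's implementation, nor the citing paper's pipeline: the lookup table, the suffix list, and the function names `lemmatize` and `crude_stem` are all hypothetical, chosen only to show why part-of-speech information matters when reducing a word to its base form.

```python
# Toy lexicon: (surface form, part of speech) -> lemma.
# Entries are illustrative assumptions, not taken from any real resource.
LEMMA_TABLE = {
    ("studies", "NOUN"): "study",
    ("studies", "VERB"): "study",
    ("better", "ADJ"): "good",
    ("meeting", "NOUN"): "meeting",
    ("meeting", "VERB"): "meet",
}


def lemmatize(token: str, pos: str) -> str:
    """Look up the lemma for a (token, PoS) pair; fall back to the lowercased token."""
    return LEMMA_TABLE.get((token.lower(), pos), token.lower())


# Suffixes tried in order; a deliberately naive stand-in for a real stemmer.
SUFFIXES = ("ing", "es", "s")


def crude_stem(token: str) -> str:
    """Strip the first matching suffix, ignoring part of speech entirely."""
    word = token.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    for token, pos in [("meeting", "NOUN"), ("meeting", "VERB"), ("studies", "VERB")]:
        print(token, pos, lemmatize(token, pos), crude_stem(token))
```

In this sketch the lookup keeps the noun "meeting" intact while mapping the verb to "meet", whereas the suffix stripper treats both the same and turns "studies" into a non-word. That is the kind of meaning loss the quoted statement gives as its reason for preferring lemmatization.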