MedLexSp – a medical lexicon for Spanish medical natural language processing

Campillos-Llanos, Leonardo

doi:10.1186/s13326-022-00281-5

Cited by 4 publications

(1 citation statement)

References 49 publications

“…Finally, the tweet dataset experienced spaCy’s lemmatization to reduce words to their base form, relying on spaCy’s statistical models and parts-of-speech information. Lemmatization is valuable for preserving the sentiment of the original words, which is crucial for accurate sentiment classification, and for maintaining the semantic meaning of words—an essential aspect for identifying and distinguishing topics [ 25 ]. However, stemming was deliberately excluded from the preprocessing pipeline, prioritizing accuracy over computational efficiency.…”

Section: Methods (mentioning)

confidence: 99%

Topic prediction for tobacco control based on COP9 tweets using machine learning techniques

Elmitwalli, Mehegan, Wellock et al. 2024. PLoS ONE

The prediction of tweets associated with specific topics offers the potential to automatically focus on and understand online discussions surrounding these issues. This paper introduces a comprehensive approach that centers on the topic of "harm reduction" within the broader context of tobacco control. The study leveraged tweets from the period surrounding the ninth Conference of the Parties to review the Framework Convention on Tobacco Control (COP9) as a case study to pilot this approach. By using Latent Dirichlet Allocation (LDA)-based topic modeling, the study successfully categorized tweets related to harm reduction. Subsequently, various machine learning techniques were employed to predict these topics, achieving a prediction accuracy of 91.87% using the Random Forest algorithm. Additionally, the study explored correlations between retweets and sentiment scores. It also conducted a toxicity analysis to understand the extent to which online conversations lacked neutrality. Understanding the topics, sentiment, and toxicity of Twitter data is crucial for identifying public opinion and its formation. By specifically focusing on the topic of “harm reduction” in tweets related to COP9, the findings offer valuable insights into online discussions surrounding tobacco control. This understanding can aid policymakers in effectively informing the public and garnering public support, ultimately contributing to the successful implementation of tobacco control policies.

Entity normalization in a Spanish medical corpus using a UMLS-based lexicon: findings and limitations

Báez, Campillos-Llanos, Núñez et al. 2024. Lang Resources & Evaluation

Predictors of accuracy in L2 Spanish preterit-imperfect production

Minnillo, Sánchez-Gutiérrez, Ruiz-Alonso-Bartol et al. 2024. IJLCR

Few studies have considered the multitude of factors that influence learners’ accuracy of past tense-aspect use in L2 Spanish. The present study fills this gap by examining course-level, task-modality, obligatory tense-aspect, and verb frequency and regularity as predictors of English-dominant learners’ accuracy in contexts that require the Spanish preterit or imperfect. Learner narrations from the COWS-L2H and CEDEL2 corpora were analyzed. Generalized mixed-effects models reveal that obligatory tense-aspect and task-modality are significant predictors of accuracy and that frequency is only a significant predictor in imperfect-obligatory contexts for students from the same Spanish program. Data from one Spanish program is interpreted as providing partial support for the Default Past Tense Hypothesis (DPTH). The findings add complexity to our understanding of the route of preterit-imperfect acquisition, showcasing plateauing effects and highlighting students’ use of the present as a default form.
