Combining string and phonetic similarity matching to identify misspelt names of drugs in medical records written in Portuguese

Tissot, Hegler; Dobson, Richard

doi:10.1186/s13326-019-0216-2

Cited by 10 publications

(7 citation statements)

References 16 publications


“…In this work, we focus on clinical datasets 1 obtained from InfoSaude (InfoHealth) [5], an EHR system. An overview of each dataset is presented below and statistics are depicted in Tables I and II -in both datasets, all relations have the domain patient as head type.…”

Section: Methods (mentioning)

confidence: 99%

Clinical Knowledge Graph Embedding Representation Bridging the Gap between Electronic Health Records and Prediction Models

Chung

Liu

Tissot

2019

2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA)

Self Cite


Learning knowledge embedding representation is an increasingly important technology. However, the choice of hyperparameters is seldom justified and usually relies on exhaustive search. Understanding the effect of hyperparameter combinations on embedding quality is crucial to avoid the inefficient process and enhance practicality of embedding representation along subsequent machine learning applications. This work focuses on translational embedding models for multi-relational categorized data in the clinical domain. We trained and evaluated models with different combinations of hyperparameters on two clinical datasets. We contrasted the results by comparing metric distributions and fitting a random forest regression model. Classifiers were trained to assess embedding representation quality. Finally, clustering was tested as a validation protocol. We observed consistent patterns of hyperparameter preference and identified those that achieved better results respectively. However, results show different patterns regarding link prediction, which is taken as strong evidence that traditional evaluation protocol used for open-domain data does not necessarily lead to the best embedding representation for categorized data.



“…Moreover, Florianópolis has been using electronic medical records over the last 20 years, from which over 80% of the health network data being stored in digital format since 2008. InfoSaude [16], [17] is an Electronic Health Record (EHR) system created to manage and track medical records used to meet the needs of Florianópolis' 75 public health centers, integrating patient EHRs with multiple information structures, such as distinct types of care, pregnancies, procedures performed on each patient, applied vaccines and drug prescriptions.…”

Section: A Motivation (mentioning)

confidence: 99%

“…Very simple regular expressions were able to extract the most common facts as well as the language signals for eventual negations. Finally, a hybrid phonetic similarity algorithm [17], [21] was used to find and merge misspellings, positively identified and manually checked for about 1.5% of the mentions in each considered sets of substances and infections.…”

Section: B Dataset (mentioning)

confidence: 99%

Improving Risk Assessment of Miscarriage During Pregnancy with Knowledge Graph Embeddings

Tissot

Pedebôs

2021

J Healthc Inform Res


“…There has been interest in using NLP to develop computable algorithms from free text trial descriptions [33]- [36], and an 'eligibility criteria representation language' has been proposed [37]. However, unless EHR data sources are standardised, it is a major task to enable complex queries to run on disparate data sources [38]. Representation of time constraints also needs to be taken into account [39].…”

Section: A Improving Efficiency Of Clinical Trials (mentioning)

confidence: 99%

Natural Language Processing for Mimicking Clinical Trial Recruitment in Critical Care: A Semi-Automated Simulation Based on the LeoPARDS Trial

Tissot

Shah

Brealey

et al.

2020

IEEE J. Biomed. Health Inform.

Self Cite


Clinical trials often fail to recruit an adequate number of appropriate patients. Identifying eligible trial participants is resource-intensive when relying on manual review of clinical notes, particularly in critical care settings where the time window is short. Automated review of electronic health records (EHR) may help, but much of the information is in free text rather than a computable form. We applied natural language processing (NLP) to free text EHR data using the CogStack platform to simulate recruitment into the LeoPARDS study, a clinical trial aiming to reduce organ dysfunction in septic shock. We applied an algorithm to identify eligible patients using a moving 1-hour time window, and compared patients identified by our approach with those actually screened and recruited for the trial, for the time period that data were available. We manually reviewed records of a random sample of patients identified by the algorithm but not screened in the original trial. Our method identified 376 patients, including 34 patients with EHR data available who were actually recruited to LeoPARDS in our centre. The sensitivity of CogStack for identifying patients screened was 90% (95% CI 85%, 93%). Of the 203 patients identified by both manual screening and CogStack, the index date matched in 95 (47%) and CogStack was earlier in 94 (47%). In conclusion, analysis of EHR data using NLP could effectively replicate recruitment in a critical care trial, and identify some eligible patients at an earlier stage, potentially improving trial recruitment if implemented in real time.

