The World Wide Web Conference 2019
DOI: 10.1145/3308558.3313630

Evaluating Neural Text Simplification in the Medical Domain

Abstract: Health literacy, i.e., the ability to read and understand medical text, is a relevant component of public health. Unfortunately, many medical texts are hard for the general population to grasp, as they are targeted at highly skilled professionals and use complex language and domain-specific terms. Here, automatic text simplification, which makes text commonly understandable, would be very beneficial. However, research and development into medical text simplification is hindered by the lack of openly available training a…

Cited by 43 publications (30 citation statements)
References 27 publications
“…Deléger and Zweigenbaum (2008) detect paraphrases from comparable medical corpora of specialized and lay texts, and Kloehn et al (2018) explore UMLS (Bodenreider, 2004) and WordNet (Miller, 2009) with word embedding techniques. Furthermore, Van den Bercken et al (2019) directly align sentences from medical terminological articles in Wikipedia and Simple Wikipedia 2 , which confines the editors' vocabulary to only 850 basic English words. Then, they refine these aligned sentences by experts towards automatic evaluation.…”
Section: Text Simplification (mentioning)
Confidence: 99%
“…The resulting medical corpus has 3.3k sentence pairs. This corpus is larger than previously generated corpora (by over 1k sentence pairs) and has stricter quality control (Van den Bercken et al, 2019). Our corpus requires a medical sentence to contain 4 or more medical words and belong to medical titles as compared to the no title requirement and needing to contain only 1 medical word, as described in Van den Bercken et al (2019).…”
Section: Simple (mentioning)
Confidence: 99%
“…The final medical parallel corpus has 3.3k aligned sentence pairs 1 . Van den Bercken et al (2019) also created a parallel medical corpus by filtering sentence pairs from Wikipedia. Our corpus is significantly larger (45% larger; 2,267 pairs vs. 3,300 pairs) and uses a stricter criteria for identifying sentences: they only required a single word match in the text itself (not the title) and used a lower similarity threshold of 0.75 (vs. our approach of 0.85).…”
Section: Medical Parallel English Wikipedia Corpus Creation (mentioning)
Confidence: 99%
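The citation statement above describes corpus creation by aligning Wikipedia and Simple Wikipedia sentences and keeping only pairs whose similarity clears a threshold (0.75 in Van den Bercken et al., 0.85 in the citing work). As a rough illustration of that filtering step, here is a minimal sketch using bag-of-words cosine similarity; the actual papers likely use stronger, embedding-based similarity measures, and all function names here are hypothetical.

```python
from collections import Counter
import math

def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two sentences (illustrative only)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    num = sum(ca[w] * cb[w] for w in ca)
    den = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return num / den if den else 0.0

def align_pairs(complex_sents, simple_sents, threshold=0.85):
    """Pair each complex sentence with its most similar simple sentence,
    keeping the pair only if similarity meets the threshold."""
    pairs = []
    for c in complex_sents:
        best = max(simple_sents, key=lambda s: cosine(c, s))
        if cosine(c, best) >= threshold:
            pairs.append((c, best))
    return pairs
```

Raising the threshold (0.85 vs. 0.75) trades corpus size for pair quality, which is the design difference the quoted statement highlights.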
“…While machine translation-based approaches have not yet been proposed for translating eprescription directions, prior works such as (Yolchuyeva et al, 2018;Shardlow and Nawaz, 2019;Van den Bercken et al, 2019) have suggested solving machine translation tasks without the need for explicitly-defined rules. Neural machine translation (NMT) models have been shown to be able to learn contextual rules automatically from large corpora and produce higher quality translations (Bahdanau et al, 2014;Wu et al, 2016b;Lee et al, 2017).…”
Section: Related Work (mentioning)
Confidence: 99%