Proceedings of the 2nd Clinical Natural Language Processing Workshop 2019
DOI: 10.18653/v1/w19-1901
Effective Feature Representation for Clinical Text Concept Extraction

Abstract: Crucial information about the practice of healthcare is recorded only in free-form text, which creates an enormous opportunity for high-impact NLP. However, annotated healthcare datasets tend to be small and expensive to obtain, which raises the question of how to make maximally efficient use of the available data. To this end, we develop an LSTM-CRF model for combining unsupervised word representations and hand-built feature representations derived from publicly available healthcare ontologies. We show that …
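The model described in the abstract is an LSTM-CRF: a BiLSTM produces per-token tag scores, and a CRF layer decodes them jointly with a learned tag-transition matrix. As an illustration only (not the authors' code; all names are mine), a minimal numpy sketch of the Viterbi decoding step such a CRF layer performs:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Most likely tag sequence for a linear-chain CRF.

    emissions:   (seq_len, n_tags) per-token scores (e.g. from a BiLSTM)
    transitions: (n_tags, n_tags)  score of moving from tag i to tag j
    """
    seq_len, n_tags = emissions.shape
    score = emissions[0].copy()                    # best score ending in each tag
    backptr = np.zeros((seq_len, n_tags), dtype=int)
    for t in range(1, seq_len):
        # total[i, j] = best score ending in tag i, then moving to tag j
        total = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    # follow back-pointers from the best final tag
    best = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):
        best.append(int(backptr[t, best[-1]]))
    return best[::-1]
```

In the paper's setting, `emissions` would come from a BiLSTM over inputs that concatenate unsupervised word embeddings with the hand-built ontology-derived feature vectors.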

Cited by 10 publications (4 citation statements)
References 29 publications
“…Here, the wikipages were restricted to those that contained SNOMED‐CT concepts. They evaluated the embeddings with three Drug‐Disease Link Prediction datasets as provided by Godefroy and Potts (2019), Dingwall and Potts (2018) and Tao et al. (2019). Retrofitting, in both settings, improved the embeddings' performance…”
Section: Discussion
confidence: 99%
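Retrofitting, as referenced in the statement above, post-processes pretrained vectors so that ontology neighbours (e.g. terms linked in SNOMED-CT) end up closer together while each vector stays near its original value. A hedged sketch of the standard iterative update (function and variable names are mine, not from the cited work):

```python
import numpy as np

def retrofit(embeddings, neighbors, iterations=10, alpha=1.0, beta=1.0):
    """Retrofit word vectors to an ontology graph.

    embeddings: dict word -> np.ndarray (pretrained vectors)
    neighbors:  dict word -> list of ontology-linked words
    Each vector is pulled toward the average of its neighbours' vectors,
    balanced against staying close to its pretrained value.
    """
    q_hat = {w: v.copy() for w, v in embeddings.items()}   # originals
    q = {w: v.copy() for w, v in embeddings.items()}       # updated
    for _ in range(iterations):
        for w, nbrs in neighbors.items():
            nbrs = [n for n in nbrs if n in q]
            if not nbrs:
                continue
            q[w] = (alpha * q_hat[w] + beta * sum(q[n] for n in nbrs)) \
                   / (alpha + beta * len(nbrs))
    return q
```

After a few iterations, linked terms (say, "flu" and "influenza" under a hypothetical SNOMED-CT link) are measurably closer than in the pretrained space.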
“…Outcome prediction from clinical text has been studied regarding a variety of outcomes, the most prevalent being in-hospital mortality (Ghassemi et al., 2014; Jo et al., 2017; Suresh et al., 2018; Si and Roberts, 2019), diagnosis prediction (Tao et al., 2019; Liu et al., 2018, 2019a) and phenotyping (Liu et al., 2019b; Jain et al., 2019; Oleynik et al., 2019; Pfaff et al., 2020). In recent years, most approaches have been based on deep neural networks due to their ability to outperform earlier methods in most settings…”
Section: Clinical Outcome Prediction
confidence: 99%
“…While gene embeddings can be directly learned using the GIT model, it has been shown in the field of NLP that pre-trained word embeddings can significantly improve performance on other related NLP tasks. 12,14 Such pre-trained word embeddings capture the co-occurrence patterns of words in a language and exhibit sound semantic properties: words of similar semantic meaning are close in embedding space, e.g., e_"each" ≈ e_"every". We therefore propose an algorithm called "Gene2Vec" to pre-train the gene embeddings, which is closely related to the skip-gram word2vec 12 pre-training algorithm…”
Section: S2 Gene2vec Algorithm Implementation
confidence: 99%
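Skip-gram word2vec, which the quoted passage says Gene2Vec builds on, trains on (center, context) pairs drawn from a sliding window over a sequence. A minimal sketch of that pair-generation step (applying it to gene identifiers rather than words is an assumption here, for illustration):

```python
def skipgram_pairs(tokens, window=2):
    """(center, context) training pairs as used by skip-gram word2vec.

    For Gene2Vec-style pretraining, `tokens` would be a sequence of
    co-occurring gene identifiers instead of words (an assumption).
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:                      # skip the center itself
                pairs.append((center, tokens[j]))
    return pairs
```

Each pair then feeds the skip-gram objective: predicting the context item from the center item, which is what gives co-occurring items similar embeddings.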
“…This representation simply indicates which gene is perturbed, but it does not reflect the functional impact of the SGA, nor can it represent the similarity of distinct SGAs that perturb a common signaling pathway. We conjecture that it is possible to represent an SGA as a low-dimensional vector, in the same manner as the "word embedding" [12–14] in the natural language processing (NLP) field, such that the representation reflects the functional impact of a gene on biological systems, and genes sharing similar functions should be closely located in such embedding space. Here the "similar function" is broadly defined, e.g., genes from the same pathway or of the same biological process…”
Section: Introduction
confidence: 99%
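The conjecture above is that functionally similar genes should be "closely located" in embedding space; closeness is typically measured with cosine similarity. A toy illustration (the three vectors are made up for the example, not real learned embeddings):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity: near 1 for vectors pointing the same way."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical 3-d embeddings: two pathway-related genes and one unrelated gene.
e_tp53 = np.array([1.0, 0.9, 0.1])
e_mdm2 = np.array([0.9, 1.0, 0.2])
e_opn1 = np.array([0.1, 0.0, 1.0])
```

Under the conjecture, a learned SGA embedding would give the pathway-related pair the higher similarity, as these toy vectors do.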