Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018
DOI: 10.18653/v1/n18-1202

Deep Contextualized Word Representations

Abstract: We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Our word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pretrained on a large text corpus. We show that these representations can be easily added to existing models and significantly improve the state of the art acros…
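The abstract describes ELMo vectors as learned functions of the biLM's internal states: concretely, the paper collapses the per-layer hidden states into one vector via a softmax-normalized, learned weighting, scaled by a task-specific scalar. A minimal numpy sketch of that scalar mix (function and variable names here are hypothetical, not from the paper's code):

```python
import numpy as np

def elmo_embedding(layer_states, scalar_weights, gamma=1.0):
    """Collapse a biLM's per-layer hidden states for one token into a
    single ELMo vector: a softmax-weighted sum over the L+1 layer
    representations, scaled by a task-specific gamma."""
    s = np.exp(scalar_weights - np.max(scalar_weights))
    s /= s.sum()  # softmax over layers
    # weighted sum over the layer axis -> one vector per token
    return gamma * np.tensordot(s, layer_states, axes=1)

# Toy example: 3 layers (token embedding + 2 biLSTM layers), dimension 4.
states = np.arange(12.0).reshape(3, 4)  # hypothetical biLM states for one token
vec = elmo_embedding(states, np.zeros(3))  # zero weights -> uniform mix
```

With all scalar weights equal (as before any task-specific training), the softmax is uniform and the result is simply the mean of the layer states; downstream training then learns which layers to emphasize.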

Cited by 9,271 publications (7,187 citation statements)
References 38 publications
“…Our ensemble model on the SQuAD dataset achieved 81.5% F1 and 73.3% EM. Moreover, combining S²-Net with ELMo yielded 83.1% F1 (74.6% EM), a 2.3-point improvement over the S²-Net model alone. ELMo is a function of the internal states of a deep bidirectional language model (biLM) that is pretrained on a large text corpus.…”
Section: Results (mentioning)
confidence: 99%
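The F1 and EM figures quoted above are SQuAD's standard answer-span metrics: EM is exact string match, while F1 measures token overlap between the predicted and reference answers. A minimal sketch of token-level F1 (the official evaluation script also normalizes case, articles, and punctuation, which is omitted here):

```python
from collections import Counter

def squad_token_f1(prediction, reference):
    """Token-overlap F1 between a predicted and a reference answer span
    (simplified sketch of the SQuAD metric; no text normalization)."""
    pred, ref = prediction.split(), reference.split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

squad_token_f1("the biLM model", "a biLM model")  # 2/3 precision, 2/3 recall -> F1 = 2/3
```

Reported scores are these per-question values averaged over the dataset, which is why F1 is always at least as high as EM.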
“…12 Of the best-performing machine learning models tested, XGBoost had 91.9% accuracy. The ANN using an ELMo-BiLSTM-CNN-CRF architecture 13,14 had an accuracy of 94.4%. The methods were applied only to the VetCompass corpus for companion animals; however, the results could be extended to other species and applied to other datasets within VetCompass.…”
Section: Discussion (mentioning)
confidence: 99%
“…This approach learns lexical representations from a large unannotated corpus (Devlin, Chang, Lee, & Toutanova; Mikolov et al., 2013; Pennington et al., 2014; Peters et al., 2018). The classical word2vec model comprises two architectures that can generate word embeddings (Mikolov et al., 2013).…”
Section: Corpus-based Approach (mentioning)
confidence: 99%
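The two word2vec architectures referenced above are CBOW (predict a target word from its surrounding context) and skip-gram (predict each context word from the target). Both train on (target, context) pairs drawn from a sliding window; a toy sketch of how skip-gram derives those pairs (function name hypothetical):

```python
def skipgram_pairs(tokens, window=1):
    """Yield the (target, context) training pairs skip-gram would
    generate from a token sequence with the given window size."""
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

sent = ["deep", "contextualized", "word", "representations"]
skipgram_pairs(sent)  # 6 pairs, starting with ("deep", "contextualized")
```

CBOW would invert each pair's roles, gathering all context tokens in the window to predict the single target; the learned input-projection weights become the word embeddings in either case.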