Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016
DOI: 10.18653/v1/n16-1162
Learning Distributed Representations of Sentences from Unlabelled Data

Abstract: Unsupervised methods for learning distributed representations of words are ubiquitous in today's NLP research, but far less is known about the best ways to learn distributed phrase or sentence representations from unlabelled data. This paper is a systematic comparison of models that learn such representations. We find that the optimal approach depends critically on the intended application. Deeper, more complex models are preferable for representations to be used in supervised systems, but shallow log-linear models …
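The abstract contrasts deeper sequence encoders with shallow log-linear composition. As a rough illustration of the shallow end of that spectrum (not a reproduction of any specific model from the paper), the sketch below builds a sentence vector by averaging pre-trained word vectors; the dictionary `word_vecs` and the toy vocabulary are illustrative assumptions.

```python
import numpy as np

def embed_sentence(tokens, word_vecs, dim=300):
    """Shallow, order-insensitive sentence representation:
    the mean of the pre-trained vectors of the in-vocabulary words."""
    vecs = [word_vecs[t] for t in tokens if t in word_vecs]
    if not vecs:  # no known words: fall back to a zero vector
        return np.zeros(dim)
    return np.mean(vecs, axis=0)

# Toy random vectors stand in for real pre-trained embeddings.
rng = np.random.default_rng(0)
word_vecs = {w: rng.normal(size=300) for w in ["the", "cat", "sat", "on", "mat"]}
sent_vec = embed_sentence("the cat sat on the mat".split(), word_vecs)
print(sent_vec.shape)  # (300,)
```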

Cited by 413 publications (510 citation statements)
References 27 publications
“…The simplest Average model achieves competitive results while the most complex LSTM model does not show advantages. [Table 1 excerpt: correlation coefficients of model predictions with subject similarity ratings on the Chinese sentence similarity task; bold marks the best among models with the same composition function. … (Le and Mikolov, 2014): 0.7561; FastSent (Hill et al., 2016): 0.7369; Char-CNN (Kim et al., 2016): 0.8095; Charagram (Wieting et al., 2016a): …]…”
Section: Results (mentioning)
confidence: 99%
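The quoted excerpt evaluates models by the Pearson correlation between predicted sentence similarities and human subject ratings. A minimal sketch of that evaluation step, with made-up numbers standing in for real model outputs and ratings:

```python
from scipy.stats import pearsonr

# Hypothetical values: one predicted similarity and one human rating per sentence pair.
model_scores = [0.82, 0.34, 0.91, 0.15, 0.67]
human_ratings = [4.5, 1.8, 4.9, 0.9, 3.6]

r, p_value = pearsonr(model_scores, human_ratings)
print(f"Pearson r = {r:.4f} (p = {p_value:.3g})")
```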
“…A better continuous space vector representation of the messages might improve SD2 and SP2. Much research has been conducted recently on obtaining better continuous space vector representations of sentences (Le and Mikolov, 2014; Kiros et al., 2015; Hill et al., 2016) instead of centroid vectors. Another direction for future work would be to investigate replacing the SVM classifiers by multilayer perceptrons, possibly on top of recurrent neural nets that would compute vector representations of sentences.…”
Section: Discussion (mentioning)
confidence: 99%
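The cited discussion represents each message as the centroid of its word vectors and feeds that vector to an SVM classifier, with multilayer perceptrons mentioned as a possible replacement. A hedged sketch of the centroid-plus-SVM pipeline using scikit-learn; `centroid_vector`, `word_vecs`, and the toy data are illustrative, not taken from the cited work:

```python
import numpy as np
from sklearn.svm import SVC

def centroid_vector(tokens, word_vecs, dim=300):
    """Centroid (mean) of the word vectors of a message."""
    vecs = [word_vecs[t] for t in tokens if t in word_vecs]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

# Toy random vectors and two labelled messages, in place of real data.
rng = np.random.default_rng(1)
word_vecs = {w: rng.normal(size=300) for w in ["good", "bad", "movie", "film"]}
X = np.stack([
    centroid_vector("good movie".split(), word_vecs),
    centroid_vector("bad film".split(), word_vecs),
])
y = [1, 0]

clf = SVC(kernel="linear").fit(X, y)  # SVM on centroid features
print(clf.predict(X))
```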
“…Lastly, we note other recent work that considers a similar transfer learning setting. The FastSent model (Hill et al., 2016) uses the 2014 STS task in its evaluation and reports an average Pearson's r of 61.3. On the same data, the C-PHRASE model (Pham et al., 2015) has an average Pearson's r of 65.7.…”
Section: Sentence Embedding Experiments (mentioning)
confidence: 99%
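The STS comparison above scores each sentence pair by the similarity of the two sentence embeddings and then correlates those scores with gold judgements, as in the Pearson sketch earlier. A small sketch of the pair-scoring step (cosine similarity); the vectors here are random placeholders, not outputs of FastSent or C-PHRASE:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two sentence vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Placeholder vectors; in the quoted evaluation each would come from a sentence encoder.
rng = np.random.default_rng(2)
vec_a, vec_b = rng.normal(size=300), rng.normal(size=300)
print(f"pair score: {cosine_similarity(vec_a, vec_b):.3f}")
```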