Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1011

Learning Compressed Sentence Representations for On-Device Text Processing

Abstract: Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems. The learned representations are generally assumed to be continuous and real-valued, giving rise to a large memory footprint and slow retrieval speed, which hinders their applicability to low-resource (memory and computation) platforms, such as mobile devices. In this paper, we propose four different strategies to transform continuous and generic sentence embeddi…
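
The abstract is truncated above, but its central idea, converting continuous sentence embeddings into compact codes so they fit memory- and compute-constrained devices, can be illustrated with a minimal sketch. The Python example below uses one generic strategy, hard thresholding at the per-dimension median followed by bit-packing; it is an illustration of binarized sentence codes under assumed settings, not necessarily one of the four strategies proposed in the paper, and the embeddings are synthetic stand-ins.

import numpy as np

def binarize_embeddings(embeddings: np.ndarray) -> np.ndarray:
    """Binarize continuous embeddings by thresholding each dimension at its
    corpus-wide median, then pack the bits (1 bit per dimension)."""
    thresholds = np.median(embeddings, axis=0)          # per-dimension threshold
    bits = (embeddings > thresholds).astype(np.uint8)   # {0, 1} per dimension
    return np.packbits(bits, axis=1)                    # 8 dimensions per byte

def hamming_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two packed binary codes: fraction of matching bits."""
    differing = np.unpackbits(np.bitwise_xor(a, b))
    return 1.0 - float(differing.mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for real sentence embeddings (e.g. 4096-d vectors).
    dense = rng.normal(size=(1000, 4096)).astype(np.float32)
    codes = binarize_embeddings(dense)
    print(f"compression ratio: {dense.nbytes / codes.nbytes:.0f}x")   # ~32x
    print(f"similarity(0, 1):  {hamming_similarity(codes[0], codes[1]):.3f}")

Binary codes of this kind are attractive on-device because similarity search reduces to XOR and popcount over packed bytes rather than floating-point dot products.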

Cited by 9 publications (8 citation statements)
References 29 publications (43 reference statements)
“…Simultaneously, we find that our static embeddings substantially outperform Word2Vec and GloVe, which suggests our method serves the dual purpose of being a lightweight mechanism for generating static embeddings that track with advances in contextualized representations. Since static embeddings have significant advantages with respect to speed, computational resources, and ease of use, these results have important implications for resource-constrained settings (Shen et al., 2019), environmental concerns (Strubell et al., 2019), and the broader accessibility of NLP technologies. Alongside more developed methods for embedding analysis, the static embedding setting is also equipped with a richer body of work regarding social bias.…”
Section: Introduction
Mentioning (confidence: 99%)
“…Research works have been carried out to model the order of words when learning the distributed sentence representation (Le and Mikolov, 2014; Kiros et al., 2015; Conneau et al., 2017; Pagliardini et al., 2018; Gupta et al., 2019; Shen et al., 2019). Le and Mikolov propose Doc2vec (Le and Mikolov, 2014) to add a paragraph vector to represent the missing information from the current context.…”
Section: Sentence Embedding
Mentioning (confidence: 99%)
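
The excerpt above summarizes the paragraph-vector idea behind Doc2vec (Le and Mikolov, 2014). As a minimal illustration, the sketch below uses gensim's Doc2Vec; the tiny corpus and hyperparameters are placeholders rather than settings from any of the cited papers.

# Minimal paragraph-vector sketch using gensim (illustrative corpus and settings).
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

corpus = [
    TaggedDocument(words=["compressed", "sentence", "representations"], tags=[0]),
    TaggedDocument(words=["on", "device", "text", "processing"], tags=[1]),
]
# Each document gets a paragraph vector trained jointly with the word vectors.
model = Doc2Vec(corpus, vector_size=50, min_count=1, epochs=40)

# Infer a paragraph vector for an unseen token sequence.
vec = model.infer_vector(["compressed", "text", "processing"])
print(vec.shape)  # (50,)
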
“…Gupta et al. (2019) propose two modifications of Word2vec by considering higher-order word n-grams along with uni-grams during training. Shen et al. (2019) use InferSent (Conneau et al., 2017) for sentence embeddings based on word vectors learned by GloVe (Pennington et al., 2014) or FastText (Joulin et al., 2017). Gupta et al. (2019) claim that training word embeddings along with higher n-gram embeddings helps in the removal of the contextual information from the uni-gram, resulting in better stand-alone word embeddings.…”
Section: Sentence Embedding
Mentioning (confidence: 99%)
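
The excerpt above describes composing representations from uni-gram plus higher-order n-gram embeddings. The toy sketch below shows only the composition step, with randomly initialized lookup tables standing in for embeddings that would in practice be learned jointly on a large corpus.

# Toy uni-gram + bigram composition; vocabularies and vectors are illustrative.
import numpy as np

DIM = 8
rng = np.random.default_rng(0)
unigram_vecs = {w: rng.normal(size=DIM)
                for w in ["on", "device", "text", "processing"]}
bigram_vecs = {("on", "device"): rng.normal(size=DIM),
               ("text", "processing"): rng.normal(size=DIM)}

def embed(tokens):
    """Average the uni-gram vectors and any matching bigram vectors."""
    grams = [unigram_vecs[t] for t in tokens if t in unigram_vecs]
    grams += [bigram_vecs[b] for b in zip(tokens, tokens[1:]) if b in bigram_vecs]
    return np.mean(grams, axis=0)

print(embed(["on", "device", "text", "processing"]).shape)  # (8,)
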
“…Quantization and other compression techniques have been explored for word embeddings (Shu and Nakayama, 2017; Tissier et al., 2019) and sentence embeddings (Shen et al., 2019; Fan et al., 2020). Quantization is complementary to the approaches we consider and is explored more in Section 5.…”
Section: Related Work
Mentioning (confidence: 99%)
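
The excerpt above notes that quantization is complementary to the compression strategies considered for sentence embeddings. The sketch below shows generic uniform 8-bit scalar quantization of an embedding matrix; it is an illustrative scheme, not the specific method of any cited paper, and the embeddings are synthetic.

# Uniform 8-bit scalar quantization of a float embedding matrix (illustrative).
import numpy as np

def quantize_uint8(x: np.ndarray):
    """Map floats to uint8 codes; return codes plus (scale, offset) to dequantize."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0
    codes = np.round((x - lo) / scale).astype(np.uint8)
    return codes, scale, lo

def dequantize(codes: np.ndarray, scale: float, lo: float) -> np.ndarray:
    return codes.astype(np.float32) * scale + lo

if __name__ == "__main__":
    emb = np.random.default_rng(1).normal(size=(100, 300)).astype(np.float32)
    codes, scale, lo = quantize_uint8(emb)
    approx = dequantize(codes, scale, lo)
    print(emb.nbytes // codes.nbytes)            # 4x smaller than float32
    print(np.abs(emb - approx).max())            # small reconstruction error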