Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation regarding the added value of sophisticated compositional functions. In this paper, we conduct a point-by-point comparative study of Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, relative to word-embedding-based RNN/CNN models. Surprisingly, SWEMs exhibit comparable or even superior performance in the majority of cases considered. Based upon this understanding, we propose two additional pooling strategies over learned word embeddings: (i) a max-pooling operation for improved interpretability; and (ii) a hierarchical pooling operation, which preserves spatial (n-gram) information within text sequences. We present experiments on 17 datasets encompassing three tasks: (i) (long) document classification; (ii) text sequence matching; and (iii) short text tasks, including classification and tagging. The source code and datasets can be obtained from https://github.com/dinghanshen/SWEM.
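The pooling strategies named in the abstract are simple enough to sketch directly. Below is a minimal NumPy sketch of average, max, and hierarchical pooling over a matrix of word embeddings; the function names, toy dimensions, and the window size of 5 are illustrative assumptions, not taken from the paper's released code.

```python
import numpy as np

def swem_aver(emb):
    # average pooling: mean over the time (word) axis
    return emb.mean(axis=0)

def swem_max(emb):
    # max pooling: per dimension, keep the most salient word feature
    return emb.max(axis=0)

def swem_hier(emb, window=5):
    # hierarchical pooling: average within each local window (an n-gram),
    # then max-pool across windows to retain local word-order information
    n = emb.shape[0]
    if n <= window:
        return emb.mean(axis=0)
    locals_ = np.stack([emb[i:i + window].mean(axis=0)
                        for i in range(n - window + 1)])
    return locals_.max(axis=0)

# toy usage: a 7-word sequence with 4-dimensional embeddings
rng = np.random.default_rng(0)
emb = rng.normal(size=(7, 4))
print(swem_aver(emb).shape, swem_max(emb).shape, swem_hier(emb).shape)
```

All three operations are parameter-free, which is the point of the comparison: any accuracy they achieve comes from the word embeddings alone rather than from a learned compositional function.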
Word embeddings are effective intermediate representations for capturing semantic regularities between words, when learning the representations of text sequences. We propose to view text classification as a label-word joint embedding problem: each label is embedded in the same space with the word vectors. We introduce an attention framework that measures the compatibility of embeddings between text sequences and labels. The attention is learned on a training set of labeled samples to ensure that, given a text sequence, the relevant words are weighted higher than the irrelevant ones. Our method maintains the interpretability of word embeddings, and enjoys a built-in ability to leverage alternative sources of information, in addition to input text sequences. Extensive results on several large text datasets show that the proposed framework outperforms the state-of-the-art methods by a large margin, in terms of both accuracy and speed.
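As a rough illustration of the compatibility idea, the sketch below scores each word by its cosine similarity to the closest label embedding and uses a softmax over those scores as attention weights. This is a simplified reading of the abstract: the published model additionally applies a phrase-level (windowed) compatibility measure, which is omitted here, and all names and dimensions are illustrative.

```python
import numpy as np

def label_word_attention(words, labels):
    """Attend over words by their compatibility with label embeddings.

    words:  (seq_len, d) word embeddings for one text sequence
    labels: (num_classes, d) label embeddings in the same space
    """
    # cosine compatibility between every word and every label
    w = words / np.linalg.norm(words, axis=1, keepdims=True)
    c = labels / np.linalg.norm(labels, axis=1, keepdims=True)
    g = w @ c.T                          # (seq_len, num_classes)
    scores = g.max(axis=1)               # strongest label match per word
    scores -= scores.max()               # numerical stability
    attn = np.exp(scores) / np.exp(scores).sum()   # softmax over words
    z = attn @ words                     # attended sequence representation
    return z, attn

# toy usage: 6 words, 4 labels, 8-dimensional shared embedding space
rng = np.random.default_rng(0)
z, attn = label_word_attention(rng.normal(size=(6, 8)),
                               rng.normal(size=(4, 8)))
```

Because the attention weights live over words, inspecting `attn` directly shows which words drove a prediction, which is what gives the method its interpretability.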
Drylands cover 41% of Earth's surface and are the largest source of interannual variability in the global carbon sink. Drylands are projected to experience accelerated expansion over the next century, but the implications of this expansion for variability in gross primary production (GPP) remain elusive. Here we show that by 2100 total dryland GPP will increase by 12 ± 3% relative to the 2000-2014 baseline. Because drylands will largely expand into formerly productive ecosystems, this increase in dryland GPP may not increase global GPP. Further, GPP per unit dryland area will decrease as degradation of historical drylands outpaces the higher GPP of expanded drylands. Dryland expansion and climate-induced conversions among sub-humid, semi-arid, arid, and hyper-arid subtypes will lead to substantial changes in regional and subtype contributions to global dryland GPP variability. Our results highlight the vulnerability of dryland subtypes to more frequent and severe climate extremes and suggest that regional variations will require different mitigation strategies.
Semantic hashing has become a powerful paradigm for fast similarity search in many information retrieval systems. While fairly successful, previous techniques generally require two-stage training, and the binary constraints are handled ad hoc. In this paper, we present an end-to-end Neural Architecture for Semantic Hashing (NASH), where the binary hashing codes are treated as Bernoulli latent variables. A neural variational inference framework is proposed for training, where gradients are directly backpropagated through the discrete latent variable to optimize the hash function. We also draw connections between the proposed method and rate-distortion theory, which provides a theoretical foundation for the effectiveness of the proposed framework. Experimental results on three public datasets demonstrate that our method significantly outperforms several state-of-the-art models in both unsupervised and supervised scenarios.
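The abstract states that gradients are backpropagated directly through the discrete latent variable. One standard way to realize this is a straight-through estimator over Bernoulli samples, sketched below in PyTorch under that assumption; `HashEncoder` and its dimensions are illustrative and not the paper's released implementation.

```python
import torch
import torch.nn as nn

class HashEncoder(nn.Module):
    """Map a document representation to binary hash codes treated as
    Bernoulli latent variables, with straight-through gradients."""

    def __init__(self, input_dim, code_bits):
        super().__init__()
        self.fc = nn.Linear(input_dim, code_bits)

    def forward(self, x):
        probs = torch.sigmoid(self.fc(x))   # Bernoulli parameters in (0, 1)
        hard = torch.bernoulli(probs)       # discrete sample in {0, 1}
        # straight-through: the forward pass emits the hard codes, while
        # the backward pass routes gradients through the continuous probs
        return hard + probs - probs.detach()

# toy usage: 8 documents with 300-d features mapped to 32-bit codes
enc = HashEncoder(input_dim=300, code_bits=32)
codes = enc(torch.randn(8, 300))            # (8, 32), values in {0, 1}
```

The trick is that `hard + probs - probs.detach()` evaluates to the discrete codes numerically, yet its gradient with respect to the encoder parameters is that of `probs`, so the whole pipeline trains end to end without a separate binarization stage.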