Encoding features robust to unseen modes of variation with attentive long short-term memory

Baddar, Wissam J.; Ro, Yong Man

doi:10.1016/j.patcog.2019.107159

Cited by 2 publications

(2 citation statements)

References 13 publications

“…LSTM+LR. It is the combination of two typical methods: long short-term memory (LSTM) model [46] and logistic regression (LoR) model [47]. The LSTM is a sequential modelling method, and its role is to extract semantic features for posts.…”

Section: Experimental Settings (mentioning)

confidence: 99%

Deep Graph neural network-based spammer detection under the perspective of heterogeneous cyberspace

Guo, Tang, Guo, et al. 2021

Future Generation Computer Systems

“…Recurrent neural networks (RNNs), as one of the most widespread neural network architectures, mainly focus on a wide variety of applications where data is sequential, such as text classification [38], language modeling [60], speech recognition [64], machine translation [56]. On the other hand, RNNs have also shown their notable success in image processing [17], including but not limited to text recognition in scenes [55], facial expression recognition [6], visual question answering [4], handwriting recognition [52]. As a well-known architecture of RNNs, LSTM [27] has been widely utilized to encode various input (e.g., image, text, audio and video) to improve the recognition performance [60].…”

Section: Introduction (mentioning)

confidence: 99%

Efficient and effective training of sparse recurrent neural networks

Liu, Ni'mah, Menkovski, et al. 2021

Neural Comput & Applic

Recurrent neural networks (RNNs) have achieved state-of-the-art performances on various applications. However, RNNs are prone to be memory-bandwidth limited in practical applications and need both long periods of training and inference time. The aforementioned problems are at odds with training and deploying RNNs on resource-limited devices where the memory and floating-point operations (FLOPs) budget are strictly constrained. To address this problem, conventional model compression techniques usually focus on reducing inference costs, operating on a costly pre-trained model. Recently, dynamic sparse training has been proposed to accelerate the training process by directly training sparse neural networks from scratch. However, previous sparse training techniques are mainly designed for convolutional neural networks and multi-layer perceptron. In this paper, we introduce a method to train intrinsically sparse RNN models with a fixed number of parameters and floating-point operations (FLOPs) during training. We demonstrate state-of-the-art sparse performance with long short-term memory and recurrent highway networks on widely used tasks, language modeling, and text classification. We simply use the results to advocate that, contrary to the general belief that training a sparse neural network from scratch leads to worse performance than dense networks, sparse training with adaptive connectivity can usually achieve better performance than dense models for RNNs.
