A review on the long short-term memory model

Houdt, Greg Van; Mosquera, Carlos; Nápoles, Gonzalo

doi:10.1007/s10462-020-09838-1

Cited by 683 publications

(185 citation statements)

References 135 publications

Mentioning: 183

“…We use Long–Short-term Memory (LSTM) [44], TransformerCPI, and TAPE as baselines, as shown in Table 1. Two variants of LSTM models are tested to compare with the above three groups of experiments: LSTM with distilled triplets and distilled singlets.…”

Section: Methods (mentioning)

confidence: 99%

MSA-Regularized Protein Sequence Transformer toward Predicting Genome-Wide Chemical-Protein Interactions: Application to GPCRome Deorphanization

Lim, Abbu, Qiu, et al. 2021. J. Chem. Inf. Model.


Small molecules play a critical role in modulating biological systems. Knowledge of chemical–protein interactions helps address fundamental and practical questions in biology and medicine. However, with the rapid emergence of newly sequenced genes, the endogenous or surrogate ligands of a vast number of proteins remain unknown. Homology modeling and machine learning are two major methods for assigning new ligands to a protein but mostly fail when sequence homology between an unannotated protein and those with known functions or structures is low. In this study, we develop a new deep learning framework to predict chemical binding to evolutionarily divergent unannotated proteins, whose ligands cannot be reliably predicted by existing methods. By incorporating evolutionary information into self-supervised learning of unlabeled protein sequences, we develop a novel method, distilled sequence alignment embedding (DISAE), for the protein sequence representation. DISAE can utilize all protein sequences and their multiple sequence alignment (MSA) to capture functional relationships between proteins without knowledge of their structure and function. Following DISAE pretraining, we devise a module-based fine-tuning strategy for the supervised learning of chemical–protein interactions. In the benchmark studies, DISAE significantly improves the generalizability of machine learning models and outperforms the state-of-the-art methods by a large margin. Comprehensive ablation studies suggest that the use of MSA, sequence distillation, and triplet pretraining critically contributes to the success of DISAE. The interpretability analysis of DISAE suggests that it learns biologically meaningful information. We further use DISAE to assign ligands to human orphan G-protein coupled receptors (GPCRs) and to cluster the human GPCRome by integrating their phylogenetic and ligand relationships. The promising results of DISAE open an avenue for exploring the chemical landscape of entire sequenced genomes.


“…Table 6 summarizes the main configurations of the MLP model. Long-Short Term Memory [31,32]. A Long-Short Term Memory (LSTM) is a type of Recurrent Neural Network (RNN).…”

Section: Models (mentioning)

confidence: 99%

Intelligent Cyber Attack Detection and Classification for Network-Based Intrusion Detection Systems

Oliveira, Praça, Maia, et al. 2021. Applied Sciences


With the latest advances in information and communication technologies, greater amounts of sensitive user and corporate information are shared continuously across the network, making it susceptible to an attack that can compromise data confidentiality, integrity, and availability. Intrusion Detection Systems (IDS) are important security mechanisms that can perform the timely detection of malicious events through the inspection of network traffic or host-based logs. Many machine learning techniques have proven to be successful at conducting anomaly detection throughout the years, but only a few considered the sequential nature of data. This work proposes a sequential approach and evaluates the performance of a Random Forest (RF), a Multi-Layer Perceptron (MLP), and a Long-Short Term Memory (LSTM) network on the CIDDS-001 dataset. The resulting performance measures of this approach are compared with those obtained from a more traditional approach, which considers only individual flow information, to determine which methodology best suits this scenario. The experimental outcomes suggest that anomaly detection can be better addressed from a sequential perspective. The LSTM is a highly reliable model for acquiring sequential patterns in network traffic data, achieving an accuracy of 99.94% and an F1-score of 91.66%.


“…We use LSTM [44] and Transformer (distilled triplets) as baselines. Two variants of LSTM models are tested to compare with the above three groups of experiments: LSTM with distilled triplets and distilled singlets.…”

Section: Experiments Design (mentioning)

confidence: 99%

A deep learning framework for elucidating whole-genome chemical interaction space

Lim, Abbu, Qiu, et al. 2020. Preprint


Molecular interactions are the foundation of biological processes. Elucidation of genome-wide binding partners of a biomolecule will address many questions in biomedicine. However, ligands of a vast number of proteins remain elusive. Existing methods mostly fail when the protein of interest is dissimilar from those with known functions or structures. We develop a new deep learning framework, DISAE, that incorporates biological knowledge into self-supervised learning techniques for predicting ligands of novel unannotated proteins on a genome scale. In rigorous benchmark studies, DISAE outperforms state-of-the-art methods by a significant margin. The interpretability analysis of DISAE suggests that it learns biologically meaningful information. We further use DISAE to assign ligands to human orphan G-Protein Coupled Receptors (GPCRs) and to cluster the human GPCRome by integrating their phylogenetic and ligand relationships. The promising results of DISAE open an avenue for exploring the chemical landscape of entire sequenced genomes.
