2019
DOI: 10.1101/705426
Preprint

DeepPrime2Sec: Deep Learning for Protein Secondary Structure Prediction from the Primary Sequences

Abstract: Motivation: Here we investigate deep learning-based prediction of protein secondary structure from the protein primary sequence. We study the function of different features in this task, including one-hot vectors, biophysical features, protein sequence embedding (ProtVec), deep contextualized embedding (known as ELMo), and the Position Specific Scoring Matrix (PSSM). In addition to the role of features, we evaluate various deep learning architectures including the following models/mechanisms and certain combinations…
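
To make the simplest of these feature types concrete, below is a minimal sketch (an illustration, not the authors' code) that one-hot encodes a primary sequence into a per-residue feature matrix; richer features such as PSSM columns or embeddings could be concatenated to it column-wise.

# Minimal sketch of per-residue one-hot feature construction (not the authors' implementation).
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot_encode(sequence: str) -> np.ndarray:
    """Return an (L, 20) matrix with one row per residue."""
    features = np.zeros((len(sequence), len(AMINO_ACIDS)), dtype=np.float32)
    for pos, aa in enumerate(sequence):
        if aa in AA_INDEX:                      # non-standard residues stay all-zero
            features[pos, AA_INDEX[aa]] = 1.0
    return features

# Other per-residue features (e.g. a PSSM matrix) could be appended column-wise:
# combined = np.hstack([one_hot_encode(seq), pssm_matrix])
print(one_hot_encode("MKTAYIAKQR").shape)  # (10, 20)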

Cited by 27 publications (17 citation statements)
References 27 publications
“…As in natural language processing scenarios, fine-tuning of language model-based representations improved the downstream supervised task performance, which is particularly evident for small training sets. The success of automatic representation learning approaches in our experiments motivates exploring contextualized embeddings (transformers (Rao et al, 2019) or ELMo embeddings (Asgari et al, 2019b; Heinzinger et al, 2019)) as future directions.…”
Section: Conclusion and Discussion (mentioning)
Confidence: 89%
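
The contrast between frozen and fine-tuned pretrained representations highlighted in this statement can be sketched in a few lines of Keras; the layer sizes and the stand-in pretrained matrix below are assumptions for illustration, not taken from the cited papers.

# Minimal Keras sketch: frozen vs. fine-tuned pretrained embeddings for a per-residue task.
import numpy as np
from tensorflow import keras

vocab_size, embed_dim, n_classes = 21, 100, 8                       # hypothetical sizes
pretrained = np.random.rand(vocab_size, embed_dim).astype("float32")  # stand-in for pretrained vectors

def build_model(fine_tune: bool) -> keras.Model:
    inputs = keras.Input(shape=(None,), dtype="int32")
    x = keras.layers.Embedding(
        vocab_size, embed_dim,
        embeddings_initializer=keras.initializers.Constant(pretrained),
        trainable=fine_tune,                  # True = fine-tune the representation on the task
    )(inputs)
    x = keras.layers.Bidirectional(keras.layers.LSTM(64, return_sequences=True))(x)
    outputs = keras.layers.Dense(n_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)

frozen = build_model(fine_tune=False)   # representation kept fixed
tuned = build_model(fine_tune=True)     # representation updated during training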
“…However, in our implementation, we proposed taking the scale's negative logarithm before normalization, improving its discriminative power. Sequence-based embeddings have previously been used successfully in protein functional/structural annotation tasks such as secondary structure prediction (Li and Yu, 2016; Asgari et al, 2019a), point mutations (Zhou et al, 2020), protein function prediction (Asgari and Mofrad, 2015; Zhou et al, 2019; Bonetta and Valentino, 2020), and predicting structural motifs (Liu et al, 2018). In this paper, we proposed the use of ProtVec embeddings and k-mers for linear BCE prediction, improving state-of-the-art performance on different datasets.…”
Section: Discussion (mentioning)
Confidence: 99%
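
For readers unfamiliar with the ProtVec-style embeddings mentioned above, here is a minimal sketch (helper names are hypothetical, and the pretrained 3-mer vectors are assumed to be loaded elsewhere): the sequence is split into overlapping 3-mers whose pretrained vectors are averaged.

# Minimal sketch of a ProtVec-style sequence embedding (hypothetical, not the cited implementation).
import numpy as np

def kmers(sequence: str, k: int = 3) -> list[str]:
    """Overlapping k-mers, e.g. 'MKTAY' -> ['MKT', 'KTA', 'TAY']."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

def protvec_embedding(sequence: str, kmer_vectors: dict[str, np.ndarray], dim: int = 100) -> np.ndarray:
    """Average the vectors of all 3-mers found in the pretrained lookup table."""
    vecs = [kmer_vectors[km] for km in kmers(sequence) if km in kmer_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

# kmer_vectors would normally be loaded from pretrained ProtVec weights (assumed here).
kmer_vectors = {"MKT": np.ones(100), "KTA": np.zeros(100)}
print(protvec_embedding("MKTAY", kmer_vectors).shape)  # (100,)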
“…Also, deep language models such as BERT 91 and ELMo 46 were originally developed for NLP and later employed for protein representations 23,28. Furthermore, Convolutional Neural Networks (CNNs), having the ability to learn to summarize the data with adaptive filters, have been employed to represent proteins 23,63,86,102,103. Additionally, architectures that are capable of inferring patterns from sequential data (e.g., protein sequences) using the attention mechanism 23,55, such as Long Short-Term Memory (LSTM) neural networks 23,28,44,104,105 and transformer-based algorithms 106, are used in representation methods.…”
Section: Different Approaches For Representing Proteins (mentioning)
Confidence: 99%
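
As an illustration of how the CNN and LSTM components mentioned in this statement are typically combined for per-residue prediction, the following is a minimal sketch; the layer choices and sizes are assumptions, not any cited paper's exact configuration.

# Minimal sketch of a CNN + BiLSTM per-residue secondary structure classifier (illustrative only).
from tensorflow import keras

n_features, n_classes = 20, 8                  # e.g. one-hot input, 8-state (Q8) output

inputs = keras.Input(shape=(None, n_features))                    # variable-length sequences
x = keras.layers.Conv1D(64, kernel_size=7, padding="same", activation="relu")(inputs)  # local motifs
x = keras.layers.Bidirectional(keras.layers.LSTM(128, return_sequences=True))(x)       # long-range context
outputs = keras.layers.Dense(n_classes, activation="softmax")(x)  # one label per residue

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()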