2018
DOI: 10.1093/bioinformatics/bty991

DeeReCT-PolyA: a robust and generic deep learning method for PAS identification

Abstract: Motivation: Polyadenylation is a critical step for gene expression regulation during the maturation of mRNA. An accurate and robust method for poly(A) signals (PASs) identification is not only desired for the purpose of better transcripts’ end annotation, but can also help us gain a deeper insight of the underlying regulatory mechanism. Although many methods have been proposed for PAS recognition, most of them are PAS motif- and human-specific, which leads to high risks of overfitting, low gen…

Citations: cited by 44 publications (38 citation statements)
References: 27 publications
“…The pretrained network is effective in learning some more general beta‐turn features; then the transfer learning technique can transfer the base network to some more specific models that can classify nine‐class beta‐turns. We have also demonstrated some techniques for tuning deep neural networks on small data classification problems, which may be useful in other areas of biological sequence analyses with imbalanced data sets, such as genomic analysis, poly‐signal identification, post‐translational modification prediction, and so forth.…”
Section: Results (mentioning)
confidence: 99%
“…Another angle to address the prediction problem is through machine learning (ML) approaches, and several such models have been proposed. These include more traditional ML methods like support vector machine and Hidden Markov Model, but also the latest deep learning models, such as DeeReCT-PolyA and DeepGSR (23)(24)(25)(26). Deep learning models generally outperform traditional classifiers because they abandon manually selected features.…”
Section: Current Methods and Limitations (mentioning)
confidence: 99%
“…Raw sequences are directly fed in and hidden features may be learnt and modeled. The latter two tools mentioned above have been reported to show around 90% accuracy on test sequences (23,24). However, these models are only trained to distinguish whether a poly(A) signal (PAS, a conserved hexamer motif) is real, but in reality, a small proportion of functional human poly(A) sites do not require a PAS (27,28).…”
Section: Current Methods and Limitations (mentioning)
confidence: 99%
“…There are also a number of breakthroughs in using deep learning to perform biomedical image processing and biomedical diagnosis. For example, [35] proposes a method based on deep neural networks, which can reach dermatologist-level performance in classifying skin cancer; [66] uses transfer learning to solve the data-hungry problem to promote the automatic medical diagnosis; [22] proposes a deep learning method to automatically predict fluorescent labels from transmitted-light images of unlabeled biological samples; [41,160] also propose deep learning methods to analyze …
[Table fragment]
… | 1D data | CNN, RNN | [198,3,25,156,87,6,80,157,158,175,169]
Structure prediction and reconstruction | MRI images, Cryo-EM images, fluorescence microscopy images, protein contact map | 2D data | CNN, GAN, VAE | [167,90,38,168,180,196,170]
Biomolecular property and function prediction | Sequencing data, PSSM, structure properties, microarray gene expression | 1D data, 2D data, structured data | DNN, CNN, RNN | [85,204,75,4]
Biomedical image processing and diagnosis | CT images, PET images, MRI images | 2D data | CNN, GAN | [35,66,41,22,160]
Biomolecule interaction prediction and systems biology | Microarray gene expression, PPI, gene-disease interaction, disease-disease similarity network, disease-variant network | 1D data, 2D data, structured data, graph data | CNN, GCN | [95,201,203,165,71,…”
Section: Introduction (mentioning)
confidence: 99%