Identification of deleterious synonymous variants in human genomes

Buske, O. J.; Manickaraj, A.; Mital, S.; Ray, P. N.; Brudno, M.

doi:10.1093/bioinformatics/btu765

Cited by 12 publications

(12 citation statements)

References 0 publications

“…Several splice-site effect predictors (discussed later) predict the impact of both intronic and exonic variations on splicing. The Silent Variation Analyzer (SilVA) is a method for prioritization of harmful synonymous variations [Buske et al, 2013]. The majority of the variations in the training data lead to a splicing defect but there are also variations that alter the methylation pattern or translational efficiency.…”

Section: Predictors For Synonymous Variations

mentioning

confidence: 99%

Variation Interpretation Predictors: Principles, Types, Performance, and Choice

2016

Next-generation sequencing methods have revolutionized the speed of generating variation information. Sequence data have a plethora of applications and will increasingly be used for disease diagnosis. Interpretation of the identified variants is usually not possible with experimental methods. This has caused a bottleneck that many computational methods aim at addressing. Fast and efficient methods for explaining the significance and mechanisms of detected variants are required for efficient precision/personalized medicine. Computational prediction methods have been developed in three areas to address the issue. There are generic tolerance (pathogenicity) predictors for filtering harmful variants. Gene/protein/disease-specific tools are available for some applications. Mechanism and effect-specific computer programs aim at explaining the consequences of variations. Here, we discuss the different types of predictors and their applications. We review available variation databases and prediction methods useful for variation interpretation. We discuss how the performance of methods is assessed and summarize existing assessment studies. A brief introduction is provided to the principles of the methods developed for variation interpretation as well as guidelines for how to choose the optimal tools and where the field is heading in the future.

“…One of the challenges in bioinformatics is accurate identification of splice sites in DNA sequences. The discovery of splicing has elucidated the diversity of protein production and explained the increased coding potential of the genome. The DNA sequence is formed of alternating introns and exons, in the first stage, the DNA sequence transcribed into pre-mRNA, then, splicing process takes place by removing the non-coding sequences (introns) from the pre-mRNA to form mRNA sequence.…”

mentioning

confidence: 99%

On the Depth of Deep Learning Models for Splice Site Identification

Elsousy, Kathiresan, Boughorbel

2018

Preprint

The success of deep learning has been shown in various fields including computer vision, speech recognition, natural language processing and bioinformatics. The advance of deep learning in computer vision has been an important source of inspiration for other research fields. The objective of this work is to adapt known deep learning models borrowed from computer vision, such as VGGNet, ResNet and AlexNet, for the classification of biological sequences. In particular, we are interested in the task of splice site identification based on raw DNA sequences. We focus on the role of model architecture depth in model training and classification performance. We show that deep learning models outperform traditional classification methods (SVM, Random Forests, and Logistic Regression) for large training sets of raw DNA sequences. Three model families are analyzed in this work, namely VGGNet, AlexNet and ResNet. Three depth levels are defined for each model family. The models are benchmarked using the following metrics: area under the ROC curve (AUC), number of model parameters, and number of floating-point operations. Our extensive experimental evaluation shows that shallow architectures have an overall better performance than deep models. We introduce a shallow version of ResNet, named S-ResNet, and show that it gives a good trade-off between model complexity and classification performance.

Author summary: Deep learning has been widely applied to various fields in research and industry. It has also been successfully applied to genomics and in particular to splice site identification. We are interested in the use of advanced neural networks borrowed from computer…

“…The splicing regulatory elements used in our models include ESE SR‐protein SF2/ASF from ESEfinder (Smith et al.), ESS FAS‐hex3 hexamer from FAS‐ESS (Wang et al.), and putative ESE and ESS pESE/pESS (Zhang, Kangsamaksin, Chao, Banerjee, & Chasin). These features were scored using scripts provided by SilVA program (Buske, Manickaraj, Mital, Ray, & Brudno). As SilVA was designed for only synonymous mutations, we slightly modified the scripts so that they can be applied to other single‐nucleotide variants (SNVs) or indels, in exons or introns.…”

Section: Methods

mentioning

confidence: 99%

Predicting the change of exon splicing caused by genetic variant using support vector regression

Chen, Zhao, et al.

2019

Human Mutation

Alternative splicing can be disrupted by genetic variants that are related to diseases like cancers. Discovering the influence of genetic variations on the alternative splicing will improve the understanding of the pathogenesis of variants. Here, we developed a new approach, PredPSI‐SVR to predict the impact of variants on exon skipping events by using the support vector regression. From the sequence of a particular exon and its flanking regions, 42 comprehensive features related to splicing events were extracted. By using a greedy feature selection algorithm, we found eight features contributing most to the prediction. The trained model achieved a Pearson correlation coefficient (PCC) of 0.570 in the 10‐fold cross‐validation based on the training data set provided by the “vex‐seq” challenge of the 5th Critical Assessment of Genome Interpretation. In the blind test also held by the challenge, our prediction ranked the 2nd with a PCC of 0.566 that demonstrates the robustness of our method. A further test indicated that the PredPSI‐SVR is helpful in prioritizing deleterious synonymous mutations. The method is available on https://github.com/chenkenbio/PredPSI-SVR.
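The greedy feature selection mentioned in the abstract above can be illustrated with a minimal sketch. This is not the PredPSI-SVR implementation: the function names are hypothetical, the scorer is a toy stand-in (the Pearson correlation between the target and the row-wise mean of the selected feature columns) rather than a cross-validated support vector regression, and the default cap of eight features only mirrors the number reported in the abstract.

```python
from math import sqrt
from typing import Callable, List, Sequence, Tuple

Columns = Sequence[Sequence[float]]  # one inner sequence per feature


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def mean_feature_score(columns: Columns, target: Sequence[float],
                       selected: List[int]) -> float:
    """Toy scorer (assumption): PCC of the target vs. the mean of selected columns.

    A real pipeline would instead fit a regressor on the selected features
    and report a cross-validated PCC.
    """
    combined = [sum(columns[j][i] for j in selected) / len(selected)
                for i in range(len(target))]
    return pearson(combined, target)


def greedy_forward_selection(
    columns: Columns,
    target: Sequence[float],
    score: Callable[[Columns, Sequence[float], List[int]], float] = mean_feature_score,
    max_features: int = 8,
) -> Tuple[List[int], float]:
    """Add, one at a time, the feature that most improves the score.

    Stops when no remaining feature strictly improves the score or when
    max_features have been selected.
    """
    selected: List[int] = []
    best = float("-inf")
    remaining = set(range(len(columns)))
    while remaining and len(selected) < max_features:
        candidate, candidate_score = None, best
        for j in sorted(remaining):
            s = score(columns, target, selected + [j])
            if s > candidate_score:
                candidate, candidate_score = j, s
        if candidate is None:  # no strict improvement: stop
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = candidate_score
    return selected, best
```

Passing a different `score` callable, for example one that wraps a cross-validated regressor, would bring the sketch closer to the setting described in the abstract without changing the selection loop.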
