2003
DOI: 10.1016/s0010-4825(02)00057-4

Identifying splicing sites in eukaryotic RNA: support vector machine approach

Cited by 55 publications (32 citation statements)
References 16 publications
“…Our model is based on Support Vector Machines (SVMs), which are supervised learning algorithms that, given a set of features and a binary classification (e.g., positive and negative cases), find the combination of features that provides an optimal separation between the instances of the two classes (see, e.g., Ben-Hur et al 2008). SVMs are widely used in computational biology and have been shown to achieve high accuracy in a variety of problems, including the prediction of splice sites (Sun et al 2003; Yamamura and Gotoh 2003; Zhang et al 2003; Sonnenburg et al 2007) and alternative exons (Dror et al 2005).…”
Section: Introduction
mentioning
confidence: 99%
“…In this paper, we totally consider 8 encoding methods: MCM [4], MCM with DTF, MCM with UTF, WAM [4], WAM with DTF [2], WAM with UTF and 4-bit [7]- [8], 16-bit [6] binary vector encoding. MCM and WAM encoding method only consider the information contained in true donor (resp.…”
Section: Performance Comparison
mentioning
confidence: 99%
“…Shortly after its introduction, its performance has already either matched or outperformed that of traditional machine learning approaches (e.g., NN) for a wide range of applications including splice sites prediction [2]- [7]. Currently, the SVM approach mainly deals with numerical data (with the exception of special kernel functions), so the DNA sequences must be encoded beforehand in some way.…”
Section: Introduction
mentioning
confidence: 99%
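The encoding step mentioned in the statement above can be illustrated with a minimal sketch of a 4-bit binary vector encoding, in which each nucleotide becomes four 0/1 values with a single 1. The function name `encode_4bit`, the table `ONE_HOT`, and the A/C/G/T bit order are assumptions made for illustration; they are not taken from the cited papers, which may use a different bit assignment.

```python
# Hypothetical 4-bit encoding table; the A, C, G, T bit order is an assumption.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def encode_4bit(sequence):
    """Flatten a DNA string into a list of 0/1 values, four per nucleotide."""
    vector = []
    for base in sequence.upper():
        if base not in ONE_HOT:
            # Ambiguity codes such as N are rejected rather than guessed at.
            raise ValueError(f"unsupported nucleotide: {base!r}")
        vector.extend(ONE_HOT[base])
    return vector


if __name__ == "__main__":
    # A window of n nucleotides yields a numeric vector of length 4 * n.
    print(encode_4bit("GT"))
```

A fixed-length window around a candidate splice site encoded this way gives a numeric feature vector of the kind an SVM can take as input.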
“…A Markov model is a model of discrete stochastic process that evolves through the states from the set S = {s_1, s_2, …, s_n}. The main assumption is that the probability of appearance of any future state depends only on the k preceding states, for some constant k. Given a learning set of sequences, a Markov model can be built by computing the probability that a certain nucleotide x_i appears after a sequence s_i, for example, GeneMark family detects genes by identifying open reading frames (the regions between start and stop codons) using precomputed species-specific gene models as training data to determine parameters of the protein-coding and non-coding regions. The [19]. More than 90% of nucleotides can be correctly identified as either coding, or non-coding.…”
mentioning
confidence: 99%
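The construction described in the Markov model statement above, estimating the probability that a nucleotide follows a given sequence of k preceding states, can be sketched as follows. The function name `train_markov` and the use of plain relative frequencies without smoothing are assumptions for illustration, not the procedure of the cited paper.

```python
from collections import Counter, defaultdict


def train_markov(sequences, k):
    """Estimate P(next nucleotide | k preceding nucleotides) from training sequences.

    Uses plain relative frequencies (no smoothing), which is an assumption.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(k, len(seq)):
            context = seq[i - k:i]  # the k preceding states
            counts[context][seq[i]] += 1

    model = {}
    for context, next_counts in counts.items():
        total = sum(next_counts.values())
        model[context] = {sym: n / total for sym, n in next_counts.items()}
    return model


if __name__ == "__main__":
    model = train_markov(["ACGACT"], k=2)
    # In the training sequence, "AC" is followed once by G and once by T.
    print(model["AC"])
```

Contexts that never occur in the training set are simply absent from the returned dictionary, so a practical model would need some form of smoothing or a fallback to a lower order.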
“…Machine learning and data mining methods have been successfully applied to various kinds of prediction problems such as exon prediction [16], start codon prediction [17], and splice site prediction [18], [19]. More than 90% of nucleotides can be correctly identified as either coding, or non-coding.…”
mentioning
confidence: 99%