2002
DOI: 10.1002/0471250953.bi0305s00

Selecting the Right Protein‐Scoring Matrix

Abstract: Every program for searching protein sequences against a database includes a choice of a protein weight matrix, also called a scoring matrix. Weight matrices add sensitivity to the search, while statistical significance adds selectivity. Virtually every user chooses the default, typically PAM 250 or BLOSUM62. Despite the fact that the choice of matrix can strongly influence the outcome of the analysis, most users do not know why a particular matrix should be used. In general, scoring matrices implicitly represe…

Cited by 15 publications (11 citation statements)
References 9 publications
“…Therefore, at superfamily level, SCOP provides an excellent benchmark for testing how algorithms perform in cases, in which some related proteins have very low sequence similarity. Distance measure between two sequences is computed by the N-W alignment algorithm and PAM50 [10] mutation probability matrix. The distances is calculated between every pair of protein sequences in the dataset and stored in a square matrix of N×N, where N is the number proteins in a dataset to be clustered.…”
Section: Results (mentioning)
confidence: 99%
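
The procedure in the statement above (a global Needleman-Wunsch alignment for every pair of sequences, with the results stored in an N×N matrix) can be sketched roughly as follows. This is a minimal illustration, not the citing authors' implementation: `toy_score` is a hypothetical +1/-1 stand-in for a real PAM50 lookup, the linear gap penalty is assumed, and the score-to-distance conversion is also an assumption, since the quoted passage does not say how distances are derived from alignment scores.

```python
from typing import Callable, List, Sequence


def toy_score(x: str, y: str) -> int:
    # Hypothetical stand-in for a PAM50 lookup: +1 for a match, -1 otherwise.
    return 1 if x == y else -1


def nw_score(a: str, b: str,
             score: Callable[[str, str], int] = toy_score,
             gap: int = -1) -> int:
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    # prev[j] holds the best score for aligning a[:i-1] against b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            cur.append(max(
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                prev[j] + gap,                            # gap in b
                cur[j - 1] + gap,                         # gap in a
            ))
        prev = cur
    return prev[-1]


def distance_matrix(seqs: Sequence[str],
                    score: Callable[[str, str], int] = toy_score,
                    gap: int = -1) -> List[List[float]]:
    """Symmetric N x N matrix of pairwise distances for non-empty sequences.

    Assumed conversion: distance = 1 - s(a, b) / min(s(a, a), s(b, b)),
    so identical sequences are at distance 0.
    """
    n = len(seqs)
    self_scores = [nw_score(s, s, score, gap) for s in seqs]
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = nw_score(seqs[i], seqs[j], score, gap)
            d = 1.0 - s / min(self_scores[i], self_scores[j])
            dist[i][j] = dist[j][i] = d
    return dist
```

Only the upper triangle is computed and then mirrored, because the alignment score is symmetric under this scoring scheme. A real substitution matrix would replace `toy_score` without changing the rest of the sketch.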
“…being n_ξ the minimal length of both subsequences under consideration, and D(ξ(l), ν(l)) the value of the scoring matrix for the respective l-th elements of ξ and ν. As scoring matrix D, the Point Accepted Mutation (PAM250) is used for the pairwise local alignment, as recommended in (Wheeler, 2002).…”
Section: Dissimilarity Space Representation (mentioning)
confidence: 99%
“…The main ones are the so-called PAM and BLOSUM (Wheeler, 2003). The most widely used PAM matrix is PAM 250.…”
Section: Scoring Matrices (I) PAM vs BLOSUM (mentioning)
confidence: 99%
“…Thus, the PAM250 matrix is derived by multiplying the PAM1 matrix against itself 250 times. Biologically, the PAM250 matrix means there have been 2.5 amino acid replacements at each site (Wheeler, 2003). In the derivation of PAM matrices, sequences that were represented many times were not excluded from the calculation.…”
Section: Scoring Matrices (I) PAM vs BLOSUM (mentioning)
confidence: 99%
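
The extrapolation described in the statement above amounts to raising the PAM1 transition matrix to the 250th power. The sketch below illustrates that step on a made-up three-letter alphabet; `TOY_PAM1` and `TOY_FREQ` are hypothetical values chosen only so that each row sums to 1 and about 1% of positions change per step, and they are not the published PAM1 data. The log-odds scaling (10 × log10) follows the usual PAM convention, but treat the whole block as an illustration under those assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def mat_mul(x: Matrix, y: Matrix) -> Matrix:
    """Plain square-matrix product."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(m: Matrix, n: int) -> Matrix:
    """Raise m to the n-th power (n >= 1) by repeated multiplication."""
    result = [row[:] for row in m]
    for _ in range(n - 1):
        result = mat_mul(result, m)
    return result


def log_odds(m: Matrix, freq: List[float], scale: float = 10.0) -> Matrix:
    """Score = scale * log10(P(i -> j) / background frequency of j)."""
    n = len(m)
    return [[scale * math.log10(m[i][j] / freq[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical symmetric "PAM1" for a 3-letter alphabet (not real data).
# Because it is symmetric, the row/column convention does not matter here
# and the stationary background frequencies are uniform.
TOY_PAM1: Matrix = [
    [0.990, 0.006, 0.004],
    [0.006, 0.990, 0.004],
    [0.004, 0.004, 0.992],
]
TOY_FREQ = [1.0 / 3.0] * 3

toy_pam250 = mat_power(TOY_PAM1, 250)        # 250 steps of the 1% process
toy_scores = log_odds(toy_pam250, TOY_FREQ)  # unrounded log-odds scores
```

At 250 PAMs the expected number of accepted replacements is 250 per 100 residues, which is the 2.5 replacements per site mentioned in the quote. The fraction of sites that actually differ is lower, because repeated substitutions at the same site can overwrite or revert one another; in the toy matrix this shows up as diagonal entries of `toy_pam250` that stay above the background frequency.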