2005
DOI: 10.1186/1471-2105-6-68

Feature selection and nearest centroid classification for protein mass spectrometry

Abstract: Background: The use of mass spectrometry as a proteomics tool is poised to revolutionize early disease diagnosis and biomarker identification. Unfortunately, before standard supervised classification algorithms can be employed, the "curse of dimensionality" needs to be solved. Due to the sheer amount of information contained within the mass spectra, most standard machine learning techniques cannot be directly applied. Instead, feature selection techniques are used to first reduce the dimensionality of the inpu…

Cited by 139 publications (33 citation statements)
References 22 publications
“…Wrapper [17] and filter [18] methods have also been used for biomarker selection, and vice versa: some intrinsic methods are used for pre-selection to provide input for other classification methods [19,20]. We would like to point out that statistical validation is of crucial importance in variable selection, as it is throughout the entire data analysis.…”
Section: Feature Selection (mentioning)
confidence: 99%
“…Apart from basic research that has benefited a lot from the fusion of computational intelligence into biological research as previously shown, molecular medicine and especially molecular diagnostics is a rapidly evolving field that incorporates high throughput technologies of proteomics [216][217][218][219][220][221][222][223][224][225][226][227][228][229] and genomics (e.g., MALDI, SELDI, microarrays). The basic question that these high tech biotechnological systems combined with the state-of-the-art CI techniques try to solve is the discovery of efficient biomarkers for crucial diseases, contributing this way to the desired early diagnosis or even prognosis.…”
Section: Molecular Diagnostics: Proteomics and Genomics (mentioning)
confidence: 99%
“…This leads the way into the field of supervised learning methods. A wide variety of techniques exist, some of which have been used in proteomic-based research; support vector machines (SVMs) [58], nearest centroid method [59], decision trees [60][61][62][63], artificial neural networks [64], logistic regression [63], linear discriminant analysis [63] and combinations of methods [65]. These methods all have their advantages and disadvantages, some of them can be easily interpreted to obtain biological knowledge (decision trees) others are more difficult to interpret and work as a black box classifier.…”
Section: Computational Data Analysis (mentioning)
confidence: 99%