Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
Organism surfaces represent signaling sites for attraction of allies and defense against enemies. However, our understanding of these signals has been impeded by methodological limitations that have precluded direct fine-scale evaluation of compounds on native surfaces. Here, we asked whether natural products from the red macroalga Callophycus serratus act in surface-mediated defense against pathogenic microbes. Bromophycolides and callophycoic acids from algal extracts inhibited growth of Lindra thalassiae, a marine fungal pathogen, and represent the largest group of algal antifungal chemical defenses reported to date. Desorption electrospray ionization mass spectrometry (DESI-MS) imaging revealed that surface-associated bromophycolides were found exclusively in association with distinct surface patches at concentrations sufficient for fungal inhibition; DESI-MS also indicated the presence of bromophycolides within internal algal tissue. This is among the first examples of natural product imaging on biological surfaces, suggesting the importance of secondary metabolites in localized ecological interactions, and illustrating the potential of DESI-MS in understanding chemically-mediated biological processes. Keywords: imaging mass spectrometry | macroalga | natural product | surface-associated
In the clinical application of genomic data analysis and modeling, a number of factors contribute to the performance of disease classification and clinical outcome prediction. This study focuses on the k-nearest neighbor (KNN) modeling strategy and its clinical use. Although KNN is simple and clinically appealing, large performance variations were found among experienced data analysis teams in the MicroArray Quality Control Phase II (MAQC-II) project. For clinical end points and controls from breast cancer, neuroblastoma and multiple myeloma, we systematically generated 463 320 KNN models by varying feature ranking method, number of features, distance metric, number of neighbors, vote weighting and decision threshold. We identified factors that contribute to the MAQC-II project performance variation, and validated a KNN data analysis protocol using a newly generated clinical data set with 478 neuroblastoma patients. We interpreted the biological and practical significance of the derived KNN models, and compared their performance with existing clinical factors.
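The parameter sweep described above can be illustrated with a minimal sketch. It is not the protocol used in the study: the toy data, the mean-difference feature ranking, the inverse-distance vote weighting, the grid values and all function names (`rank_features`, `knn_predict`, `run_grid`) are assumptions made for illustration, and only one ranking method is included. The sketch ranks features on the training samples alone, then scores every combination of feature count, distance metric, number of neighbors, vote weighting and decision threshold on held-out samples.

```python
import math
from itertools import product


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_features(X, y):
    """Rank feature indices by absolute difference of class means (illustrative stand-in)."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return sorted(range(len(scores)), key=lambda j: -scores[j])


def knn_predict(X_train, y_train, x, k, distance, weighted, threshold):
    """Classify x as 1 if the (optionally weighted) share of positive neighbors meets the threshold."""
    nearest = sorted(
        ((distance(row, x), label) for row, label in zip(X_train, y_train)),
        key=lambda pair: pair[0],
    )[:k]
    # Assumed weighting scheme: inverse distance, with a small constant to avoid division by zero.
    weights = [1.0 / (d + 1e-9) if weighted else 1.0 for d, _ in nearest]
    positive = sum(w for w, (_, label) in zip(weights, nearest) if label == 1)
    return 1 if positive / sum(weights) >= threshold else 0


def run_grid(X_train, y_train, X_test, y_test, grid):
    """Return held-out accuracy for every parameter combination in the grid."""
    ranking = rank_features(X_train, y_train)  # ranked on training data only
    results = {}
    for n_feat, dist, k, weighted, thr in product(*grid.values()):
        cols = ranking[:n_feat]
        train = [[row[j] for j in cols] for row in X_train]
        test = [[row[j] for j in cols] for row in X_test]
        preds = [knn_predict(train, y_train, x, k, dist, weighted, thr) for x in test]
        accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
        results[(n_feat, dist.__name__, k, weighted, thr)] = accuracy
    return results


# Toy data (hypothetical): feature 0 separates the classes, features 1 and 2 are noise.
X_train = [[0.1, 5.0, 1.0], [0.2, 4.0, 2.0], [0.3, 5.5, 1.5],
           [0.9, 5.1, 1.2], [1.0, 4.2, 1.8], [1.1, 5.3, 1.4]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[0.15, 4.9, 1.3], [1.05, 4.8, 1.6]]
y_test = [0, 1]

grid = {
    "n_features": [1, 3],
    "distance": [euclidean, manhattan],
    "k": [1, 3],
    "weighted": [False, True],
    "threshold": [0.5, 0.7],
}
results = run_grid(X_train, y_train, X_test, y_test, grid)
best = max(results, key=results.get)
print(len(results), "models; best configuration:", best)
```

Even this small grid yields 32 models, which suggests how a count in the hundreds of thousands can arise once several ranking methods, many feature counts and multiple end points are crossed.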