Over the last decade, the introduction of microarray technology has had a profound impact on gene expression research. The publication of studies with dissimilar or altogether contradictory results, obtained using different microarray platforms to analyze identical RNA samples, has raised concerns about the reliability of this technology. The MicroArray Quality Control (MAQC) project was initiated to address these concerns, as well as other performance and data analysis issues. Expression data on four titration pools from two distinct reference RNA samples were generated at multiple test sites using a variety of microarray-based and alternative technology platforms. Here we describe the experimental design and probe mapping efforts behind the MAQC project. We show intraplatform consistency across test sites as well as a high level of interplatform concordance in terms of genes identified as differentially expressed. This study provides a resource that represents an important first step toward establishing a framework for the use of microarrays in clinical and regulatory settings.
We have evaluated the performance characteristics of three quantitative gene expression technologies and correlated their expression measurements to those of five commercial microarray platforms, based on the MicroArray Quality Control (MAQC) data set. The limit of detection, assay range, precision, accuracy and fold-change correlations were assessed for 997 TaqMan Gene Expression Assays, 205 Standardized RT (Sta)RT-PCR assays and 244 QuantiGene assays. We observed high correlation between quantitative gene expression values and microarray platform results and found few discordant measurements among all platforms. The main cause of variability was differences in probe sequence and thus target location. A second source of variability was the limited and variable sensitivity of the different microarray platforms for detecting weakly expressed genes, which affected interplatform and intersite reproducibility of differentially expressed genes. From this analysis, we conclude that the MAQC microarray data set has been validated by alternative quantitative gene expression platforms, thus supporting the use of microarray platforms for the quantitative characterization of gene expression.
Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
To validate and extend the findings of the MicroArray Quality Control (MAQC) project, a biologically relevant toxicogenomics data set was generated using 36 RNA samples from rats treated with three chemicals (aristolochic acid, riddelliine and comfrey) and each sample was hybridized to four microarray platforms. The MAQC project assessed concordance in intersite and cross-platform comparisons and the impact of gene selection methods on the reproducibility of profiling data in terms of differentially expressed genes using distinct reference RNA samples. The real-world toxicogenomic data set reported here showed high concordance in intersite and cross-platform comparisons. Further, gene lists generated by fold-change ranking were more reproducible than those obtained by t-test P value or Significance Analysis of Microarrays. Finally, gene lists generated by fold-change ranking with a nonstringent P-value cutoff showed increased consistency in Gene Ontology terms and pathways, and hence the biological impact of chemical exposure could be reliably deduced from all platforms analyzed.
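The gene selection rule described in this abstract (rank by fold change, then apply a nonstringent P-value cutoff) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the gene identifiers, the 0.05 cutoff and the toy numbers are all assumptions made for illustration, and the P values are taken as precomputed inputs.

```python
from typing import Dict, List


def rank_by_fold_change(
    log2_fc: Dict[str, float],
    p_values: Dict[str, float],
    p_cutoff: float = 0.05,  # assumed "nonstringent" cutoff, not from the source
    top_n: int = 100,
) -> List[str]:
    """Keep genes whose P value passes the cutoff, then rank the survivors
    by absolute log2 fold change (largest first) and return the top_n."""
    passing = [gene for gene in log2_fc if p_values[gene] < p_cutoff]
    passing.sort(key=lambda gene: abs(log2_fc[gene]), reverse=True)
    return passing[:top_n]


# Toy, made-up values for four hypothetical genes.
example_fc = {"geneA": 2.5, "geneB": -3.0, "geneC": 0.2, "geneD": 1.8}
example_p = {"geneA": 0.01, "geneB": 0.04, "geneC": 0.001, "geneD": 0.20}

selected = rank_by_fold_change(example_fc, example_p, p_cutoff=0.05, top_n=2)
print(selected)  # ['geneB', 'geneA']
```

In this toy case geneC has the smallest P value but a negligible fold change, so it ranks last among the genes that pass the cutoff, while geneD is excluded by the cutoff despite its sizeable fold change. A list ranked purely by P value would instead put geneC first.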
The first formal qualification of safety biomarkers for regulatory decision making marks a milestone in the application of biomarkers to drug development. Following submission of drug toxicity studies and analyses of biomarker performance to the Food and Drug Administration (FDA) and European Medicines Agency (EMEA) by the Predictive Safety Testing Consortium's (PSTC) Nephrotoxicity Working Group, seven renal safety biomarkers have been qualified for limited use in nonclinical and clinical drug development to help guide safety assessments. This was a pilot process, and the experience gained will both facilitate better understanding of how the qualification process will probably evolve and clarify the minimal requirements necessary to evaluate the performance of biomarkers of organ injury within specific contexts.
Quantification of residual disease by real-time polymerase chain reaction (PCR) will become a pivotal tool in the development of patient-directed therapy. In recent years, various protocols to quantify minimal residual disease in leukemia or lymphoma patients have been developed. These assays assume that PCR efficiencies are equal for all samples. Determining t(14;18) and albumin reaction efficiencies for sixteen follicular lymphoma patient samples revealed higher efficiencies for blood samples than for lymph node samples in general. However, within one sample both reactions had equivalent efficiencies. Differences in amplification efficiencies between patient samples (low efficiencies) and the calibrator in quantitative analyses result in the underestimation of residual disease in patient samples whereby the weakest positive patient samples are at highest error. Based on these findings for patient samples, the efficiency compensation control was developed. This control includes two reference reactions in a multiplex setting, specific for the beta-actin and albumin housekeeping genes that are present in a constant ratio within DNA templates. The difference in threshold cycle values for both reference reactions, i.e., the Ct(2) value, is dependent on the amplification efficiency, and is used to compensate for efficiency differences between patient samples and the calibrator. The beta-actin reference reaction is also used to normalize for DNA input. Furthermore, the efficiency compensation control facilitates identification of patient samples that are so contaminated with PCR inhibitory compounds that different amplification reactions are affected to a different extent. Accurate quantitation of residual disease in these samples is therefore impossible with the current quantitative real-time PCR protocols. Identification and exclusion of these inadequate samples will be of utmost importance in quantitative retrospective studies, but even more so, in future molecular diagnostic analyses.
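The sketch below illustrates why a lower amplification efficiency in a patient sample leads to underestimation, and why the weakest positive samples are affected most. It assumes the standard exponential amplification model, in which the product after Ct cycles is N0 * (1 + E)^Ct. It does not reproduce the efficiency compensation control itself; the function name and the efficiency and Ct values are hypothetical.

```python
def underestimation_factor(e_sample: float, e_calibrator: float, ct: float) -> float:
    """Ratio of estimated to true starting quantity when a sample that
    amplifies with efficiency e_sample is read against a calibrator with
    efficiency e_calibrator (1.0 means perfect doubling per cycle).

    Assumed model: true N0 = Nt / (1 + e_sample) ** ct, while the estimate
    uses the calibrator efficiency: N0_est = Nt / (1 + e_calibrator) ** ct.
    """
    return ((1.0 + e_sample) / (1.0 + e_calibrator)) ** ct


# Hypothetical values: sample efficiency 0.9, calibrator efficiency 1.0.
strong_positive = underestimation_factor(0.9, 1.0, ct=25)  # reaches threshold early
weak_positive = underestimation_factor(0.9, 1.0, ct=35)    # reaches threshold late

print(f"estimated/true at Ct 25: {strong_positive:.3f}")
print(f"estimated/true at Ct 35: {weak_positive:.3f}")
```

Under this model the factor is below 1 whenever the sample amplifies less efficiently than the calibrator, and it shrinks further as Ct increases, which matches the observation that the weakest positive samples carry the largest error.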
Background: Reproducibility is a fundamental requirement in scientific experiments. Some recent publications have claimed that microarrays are unreliable because lists of differentially expressed genes (DEGs) are not reproducible in similar experiments. Meanwhile, new statistical methods for identifying DEGs continue to appear in the scientific literature. The resultant variety of existing and emerging methods exacerbates confusion and continuing debate in the microarray community on the appropriate choice of methods for identifying reliable DEG lists.
Background: The acceptance of microarray technology in regulatory decision-making is being challenged by the existence of various platforms and data analysis methods. A recent report (E. Marshall, Science, 306, 630-631, 2004), by extensively citing the study of Tan et al. (Nucleic Acids Res., 31, 5676-5684, 2003), portrays a disturbingly negative picture of the cross-platform comparability, and, hence, the reliability of microarray technology.