Over the last decade, the introduction of microarray technology has had a profound impact on gene expression research. The publication of studies with dissimilar or altogether contradictory results, obtained using different microarray platforms to analyze identical RNA samples, has raised concerns about the reliability of this technology. The MicroArray Quality Control (MAQC) project was initiated to address these concerns, as well as other performance and data analysis issues. Expression data on four titration pools from two distinct reference RNA samples were generated at multiple test sites using a variety of microarray-based and alternative technology platforms. Here we describe the experimental design and probe mapping efforts behind the MAQC project. We show intraplatform consistency across test sites as well as a high level of interplatform concordance in terms of genes identified as differentially expressed. This study provides a resource that represents an important first step toward establishing a framework for the use of microarrays in clinical and regulatory settings.
Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
Understanding structural requirements for a chemical to exhibit estrogen receptor (ER) binding has been important in various fields. This knowledge has been directly and indirectly applied to design drugs for human estrogen replacement therapy, and to identify estrogenic endocrine disruptors. This paper reports structure-activity relationships (SARs) based on a total of 230 chemicals, including both natural and xenoestrogens. Activities were generated using a validated ER competitive binding assay, which covers a 10^6-fold range. This study is focused on identification of structural commonalities among diverse ER ligands. It provides an overall picture of how xenoestrogens structurally resemble endogenous 17β-estradiol (E2) and the synthetic estrogen diethylstilbestrol (DES). On the basis of SAR analysis, five distinguishing criteria were found to be essential for xenoestrogen activity, using E2 as a template: (1) H-bonding ability of the phenolic ring mimicking the 3-OH, (2) H-bond donor mimicking the 17β-OH and O-O distance between 3- and 17β-OH, (3) precise steric hydrophobic centers mimicking steric 7α- and 11β-substituents, (4) hydrophobicity, and (5) a ring structure. The 3-position H-bonding ability of phenols is a significant requirement for ER binding. This contributes as both a H-bond donor and acceptor, although predominantly as a donor. However, the 17β-OH contributes as a H-bond donor only. The precise space (the size and orientation) of steric hydrophobic bulk groups is as important as a 17β-OH. Where a direct comparison can be made, strong estrogens tend to be more hydrophobic. A rigid ring structure favors ER binding. The knowledge derived from this study is rationalized into a set of hierarchical rules that will be useful in guidance for identification of potential estrogens.
A number of environmental and industrial chemicals are reported to possess androgenic or antiandrogenic activities. These androgenic endocrine disrupting chemicals may disrupt the endocrine system of humans and wildlife by mimicking or antagonizing the functions of natural hormones. The present study developed a low-cost recombinant androgen receptor (AR) competitive binding assay that uses no animals. We validated the assay by comparing its protocol and results with those of other similar assays, such as the binding assay using prostate cytosol. We tested 202 natural, synthetic, and environmental chemicals that encompass a broad range of structural classes, including steroids, diethylstilbestrol and related chemicals, antiestrogens, flutamide derivatives, bisphenol A derivatives, alkylphenols, parabens, alkyloxyphenols, phthalates, siloxanes, phytoestrogens, DDTs, PCBs, pesticides, organophosphate insecticides, and other chemicals. Some of these chemicals are environmentally persistent and/or commercially important, but their AR binding affinities have not been previously reported. To the best of our knowledge, these results represent the largest and most diverse data set publicly available for chemical binding to the AR. Through a careful structure-activity relationship (SAR) examination of the data set in conjunction with knowledge of the recently reported ligand-AR crystal structures, we are able to define the general structural requirements for chemical binding to AR. Hydrophobic interactions are important for AR binding. The interactions between ligand and AR at the 3- and 17-positions of testosterone and R1881, as found in other chemical classes, are discussed in depth. The SAR studies of ligand binding characteristics for AR are compared to our previously reported results for estrogen receptor binding.
Research applications in chemoinformatics and toxicoinformatics increasingly use representations of molecules in the form of numerical descriptors that capture the structural characteristics and properties of molecules. These representations are useful for ADME/toxicity prediction, diversity analysis, library design, QSAR/QSPR, virtual screening, and other purposes. Molecular descriptors have ranged from relatively simple forms calculated from simple two-dimensional (2D) chemical structures to more complex forms representing three-dimensional (3D) chemical structures or complex molecular fingerprints consisting of numerous bit positions to represent specific chemical information. The Mold2 software was developed to enable the rapid calculation of a large and diverse set of descriptors encoding two-dimensional chemical structure information. Comparative analysis of Mold2 descriptors with those calculated by Cerius2, Dragon, and Molconn-Z on several data sets using Shannon entropy analysis demonstrated that Mold2 descriptors convey a similar amount of information. In addition, using the same classification method, slightly better models were generated using Mold2 descriptors compared to those generated using descriptors from the compared commercial software packages. The low computing cost for Mold2 makes it suitable not only for small data sets, such as in QSAR, but also for large databases in virtual screening. High reproducibility and reliability are expected because Mold2 does not require 3D structures. Mold2 is freely available to the public (http://www.fda.gov/nctr/science/centers/toxicoinformatics/index.htm).
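To illustrate what a Shannon entropy comparison of descriptors can look like, here is a minimal sketch in Python. It is not the procedure used in the study: the equal-width binning, the default of 20 bins, and the function name `shannon_entropy` are all assumptions made for illustration. The idea is that a descriptor whose values spread over many bins across a data set carries more information than one that is nearly constant.

```python
import math
from collections import Counter


def shannon_entropy(values, n_bins=20):
    """Shannon entropy (in bits) of one descriptor after equal-width binning.

    The binning scheme and n_bins are illustrative assumptions, not taken
    from the abstract above.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant descriptor carries no information
    width = (hi - lo) / n_bins
    # Map each value to a bin index; the maximum value goes into the last bin.
    bins = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())


# Hypothetical descriptor columns for eight molecules.
constant = [1.0] * 8
two_level = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
spread = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

for name, column in [("constant", constant), ("two_level", two_level), ("spread", spread)]:
    print(name, shannon_entropy(column))
```

Averaging or summing such per-descriptor entropies over a whole descriptor set gives one rough way to compare how much information two descriptor packages convey on the same molecules.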
Background: Reproducibility is a fundamental requirement in scientific experiments. Some recent publications have claimed that microarrays are unreliable because lists of differentially expressed genes (DEGs) are not reproducible in similar experiments. Meanwhile, new statistical methods for identifying DEGs continue to appear in the scientific literature. The resultant variety of existing and emerging methods exacerbates confusion and continuing debate in the microarray community on the appropriate choice of methods for identifying reliable DEG lists.
Techniques for combining the results of multiple classification models to produce a single prediction have been investigated for many years. In earlier applications, the multiple models to be combined were developed by altering the training set. The use of these so-called resampling techniques, however, poses the risk of reducing the predictivity of the individual models to be combined and/or overfitting the noise in the data, which might result in poorer prediction by the composite model than by the individual models. In this paper, we suggest a novel approach, named Decision Forest, that combines multiple Decision Tree models. Each Decision Tree model is developed using a unique set of descriptors. When models of similar predictive quality are combined using the Decision Forest method, predictive quality is consistently and significantly improved relative to the individual models in both training and testing steps. An example will be presented for prediction of binding affinity of 232 chemicals to the estrogen receptor.
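The combining idea described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the published Decision Forest algorithm: each "tree" is reduced to a depth-1 stump, "unique set of descriptors" is taken to mean disjoint feature subsets, and the individual predictions are combined by averaging class probabilities. The names `fit_stump`, `fit_forest`, and `accuracy`, and the six-sample data set, are hypothetical.

```python
def fit_stump(X, y, features):
    """Fit a depth-1 tree (stump) using only the given feature indices.

    Returns a function mapping a row to the estimated probability of class 1.
    """
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if not left or not right:
                continue  # split must leave samples on both sides
            # Training errors if each leaf predicts its majority class.
            errors = (min(sum(left), len(left) - sum(left))
                      + min(sum(right), len(right) - sum(right)))
            if best is None or errors < best[0]:
                best = (errors, f, t, sum(left) / len(left), sum(right) / len(right))
    if best is None:
        raise ValueError("no usable split in the given features")
    _, f, t, p_left, p_right = best
    return lambda row: p_left if row[f] <= t else p_right


def fit_forest(X, y, feature_sets):
    """Train one stump per disjoint feature set and average their probabilities."""
    trees = [fit_stump(X, y, fs) for fs in feature_sets]
    return lambda row: sum(tree(row) for tree in trees) / len(trees)


def accuracy(model, X, y, cutoff=0.5):
    """Fraction of rows whose thresholded probability matches the label."""
    return sum((model(row) >= cutoff) == bool(yi) for row, yi in zip(X, y)) / len(y)


# Toy data: three descriptors, each wrong on a different inactive sample.
X = [[0, 0, 1], [0, 1, 0], [1, 0, 0], [1, 1, 1], [1, 1, 1], [1, 1, 1]]
y = [0, 0, 0, 1, 1, 1]

stumps = [fit_stump(X, y, [f]) for f in range(3)]
forest = fit_forest(X, y, [[0], [1], [2]])

print([accuracy(s, X, y) for s in stumps])
print(accuracy(forest, X, y))
```

On this toy set each single-descriptor model misclassifies one sample, while the averaged model classifies all six correctly, because the individual models are of similar quality but make their errors on different samples.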
Background: The acceptance of microarray technology in regulatory decision-making is being challenged by the existence of various platforms and data analysis methods. A recent report (E. Marshall, Science, 306, 630-631, 2004), by extensively citing the study of Tan et al. (Nucleic Acids Res., 31, 5676-5684, 2003), portrays a disturbingly negative picture of the cross-platform comparability, and, hence, the reliability of microarray technology.