The ETS gene family is frequently involved in chromosome translocations that cause human cancer, including prostate cancer, leukemia, and sarcoma. However, the mechanisms by which oncogenic ETS proteins, which are DNA-binding transcription factors, target genes necessary for tumorigenesis is not well understood. Ewing's sarcoma serves as a paradigm for the entire class of ETS-associated tumors because nearly all cases harbor recurrent chromosomal translocations involving ETS genes. The most common translocation in Ewing's sarcoma encodes the EWS/FLI oncogenic transcription factor. We used whole genome localization (ChIP-chip) to identify target genes that are directly bound by EWS/FLI. Analysis of the promoters of these genes demonstrated a significant over-representation of highly repetitive GGAA-containing elements (microsatellites). In a parallel approach, we found that EWS/FLI uses GGAA microsatellites to regulate the expression of some of its target genes including NR0B1, a gene required for Ewing's sarcoma oncogenesis. The microsatellite in the NR0B1 promoter bound EWS/FLI in vitro and in vivo and was both necessary and sufficient to confer EWS/FLI regulation to a reporter gene. Genome wide computational studies demonstrated that GGAA microsatellites were enriched close to EWS/FLI-up-regulated genes but not down-regulated genes. Mechanistic studies demonstrated that the ability of EWS/FLI to bind DNA and modulate gene expression through these repetitive elements depended on the number of consecutive GGAA motifs. These findings illustrate an unprecedented route to specificity for ETS proteins and use of microsatellites in tumorigenesis.
The conservation of in vitro DNA-binding properties within families of transcription factors presents a challenge for achieving in vivo specificity. To uncover the mechanisms regulating specificity within the ETS gene family, we have used chromatin immunoprecipitation coupled with genome-wide promoter microarrays to query the occupancy of three ETS proteins in a human T-cell line. Unexpectedly, redundant occupancy was frequently detected, while specific occupancy was less likely. Redundant binding correlated with housekeeping classes of genes, whereas specific binding examples represented more specialized genes. Bioinformatics approaches demonstrated that redundant binding correlated with consensus ETS-binding sequences near transcription start sites. In contrast, specific binding sites diverged dramatically from the consensus and were found further from transcription start sites. One route to specificity was found-a highly divergent binding site that facilitates ETS1 and RUNX1 cooperative DNA binding. The specific and redundant DNA-binding modes suggest two distinct roles for members of the ETS transcription factor family.[Keywords: ETS; transcription; gene families; cooperative binding; promoter specificity; ChIP-chip] Supplemental material is available at www.genesdev.org.
We have implemented a method that identifies the genomic origins of sample proteins by scanning their peptide-mass fingerprint against the theoretical translation and proteolytic digest of an entire genome. Unlike previously reported techniques, this method requires no predefined ORF or protein annotations. Fixedsize windows along the genome sequence are scored by an equation accounting for the number of matching peptides, the number of missed enzymatic cleavages in each peptide, the number of in-frame stop codons within a window, the adjacency between peptides, and duplicate peptide matches. Statistical significance of matching regions is assessed by comparing their scores to scores from windows matching randomly generated mass data. Tests with samples from Saccharomyces cerevisiae mitochondria and Escherichia coli have demonstrated the ability to produce statistically significant identifications, agreeing with two commonly used programs, PEPTIDENT and MASCOT, in 86% of samples analyzed. This genome fingerprint scanning method has the potential to aid in genome annotation, identify proteins for which annotation is incorrect or missing, and handle cases where sequencing errors have caused framing mistakes in the databases. It might also aid in the identification of proteins in which recoding events such as frameshifting or stop-codon read-through have occurred, elucidating alternative translation mechanisms. The prototype is implemented as a client͞server pair, allowing the distribution, among a set of cluster nodes, of a single or multiple genomes for concurrent analysis.
An mRNA transcript contains many potential antisense oligodeoxynucleotide target sites. Identification of the most efficacious targets remains an important and challenging problem. Building on separate work that revealed a strong correlation between the inclusion of short sequence motifs and the activity level of an oligo, we have developed a predictive artificial neural network system for mapping tetranucleotide motif content to antisense oligo activity. Trained for high-specificity prediction, the system has been cross-validated against a database of 348 oligos from the literature and a larger proprietary database of 908 oligos. In cross- validation tests the system identified effective oligos (i.e. oligos capable of reducing target mRNA expression to <25% that of the control) with 53% accuracy, in contrast to the <10% success rates commonly reported for trial-and-error oligo selection, suggesting a possible 5-fold reduction in the in vivo screening required to find an active oligo. We have implemented a web interface to a trained neural network. Given an RNA transcript as input, the system identifies the most likely oligo targets and provides estimates of the probabilities that oligos targeted against these sites will be effective.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.