The availability of a draft human genome sequence and the ability to monitor the transcription of thousands of genes with DNA microarrays have created a need for new computational tools that can analyze the cis-regulatory elements controlling genes with similar expression patterns. We have developed a tool, designated EZ-Retrieve, that can (i) retrieve any particular region of human genome sequence from the NCBI database and (ii) analyze the retrieved sequences for putative transcription factor-binding sites (TFBSs) listed in the TRANSFAC database. The tool is web-based, user-friendly and offers both batch sequence retrieval and batch TFBS prediction. A major application of EZ-Retrieve is the analysis of co-expressed genes that appear as expression clusters in DNA microarray experiments.
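The batch TFBS prediction step described above can be illustrated with a minimal sketch. It assumes the sequences have already been retrieved and that each binding site is given as an IUPAC consensus string. TRANSFAC itself describes sites with curated matrices, so the consensus matching here is a simplification, and the function names (`find_sites`, `batch_scan`) are hypothetical rather than part of EZ-Retrieve.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, consensus):
    """Return sorted (start, strand) pairs for every consensus match.

    Start positions are 0-based and always refer to the forward strand.
    A lookahead is used so that overlapping matches are all reported.
    """
    seq = seq.upper()
    body = "".join(IUPAC[c] for c in consensus.upper())
    pattern = re.compile("(?=(" + body + "))")
    n, k = len(seq), len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    # Scan the reverse complement and map each hit back to forward coordinates.
    rc = reverse_complement(seq)
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(rc)]
    return sorted(hits)


def batch_scan(sequences, motifs):
    """Scan every sequence against every motif (hypothetical batch helper)."""
    return {
        gene: {name: find_sites(seq, cons) for name, cons in motifs.items()}
        for gene, seq in sequences.items()
    }
```

For example, `batch_scan({"geneA": "AATATAAAGG"}, {"TATA-like": "TATAWA"})` reports one forward-strand match for the illustrative consensus. A matrix-based scorer would replace `find_sites` in a more faithful version.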
A major challenge in the post-sequencing era is to elucidate the activity and biological function of genes that reside in the human genome. An important subset includes genes that encode proteins that regulate gene expression or maintain the structural integrity of the genome. Using a novel oligonucleotide-binding substrate as bait, we show the feasibility of a modified functional expression-cloning strategy to identify human cDNAs that encode a spectrum of nucleic acid-binding proteins (NBPs). Approximately 170 cDNAs were identified by screening phage libraries derived from a human colorectal adenocarcinoma cell line and from noncancerous fetal lung tissue. Sequence analysis confirmed that virtually every clone contained a known DNA- or RNA-binding motif. We also report on a complementary sorting strategy that, in the absence of subcloning and protein purification, can distinguish different classes of NBPs according to their particular binding properties. To extend our functional annotation of NBPs, we have used GeneChip expression profiling of 14 different breast-derived cell lines to examine the relative transcriptional activity of genes identified in our screen, and cluster analysis to discover other genes that have similar expression patterns. Finally, we present strategies to analyze the upstream regulatory region of each gene within a cluster group and select unique combinations of transcription factor-binding sites that may be responsible for dictating the observed synexpression. [The following individual kindly provided reagents, samples, or unpublished information as indicated in the paper: M. Stempher.]
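The abstract does not specify how binding-site combinations are selected, so the following is only one possible reading, sketched under stated assumptions: each gene is represented by the set of TFBS names predicted in its upstream region, a combination qualifies if it occurs in every gene of the cluster, and it counts as "unique" if it is rare among a set of background genes. The function name, the `size` parameter and the `max_background_fraction` threshold are all hypothetical.

```python
from itertools import combinations


def shared_combinations(cluster_sites, background_sites, size=2,
                        max_background_fraction=0.1):
    """Return (combination, background_fraction) pairs.

    cluster_sites and background_sites map gene names to sets of TFBS names.
    A combination is kept when every cluster gene contains all of its sites
    and at most max_background_fraction of background genes do.
    """
    if not cluster_sites:
        return []
    # Sites present in the upstream region of every gene in the cluster.
    common = set.intersection(*cluster_sites.values())
    results = []
    for combo in combinations(sorted(common), size):
        needed = set(combo)
        n_bg = sum(1 for sites in background_sites.values() if needed <= sites)
        fraction = n_bg / len(background_sites) if background_sites else 0.0
        if fraction <= max_background_fraction:
            results.append((combo, fraction))
    return results
```

Requiring a site in every cluster gene is deliberately strict; a tolerant variant could instead require presence in most cluster genes and rank combinations by enrichment over background.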