A common task posed by microarray experiments is to infer the binding site preferences for a known transcription factor from a collection of genes that it regulates and to ascertain whether the factor acts alone or in a complex. The converse problem can also be posed: Given a collection of binding sites, can the regulatory factor or complex of factors be inferred? Both tasks are substantially facilitated by using relatively simple homology models for protein-DNA interactions, as well as the rapidly expanding protein structure database. For budding yeast, we are able to construct reliable structural models for 67 transcription factors and with them redetermine factor binding sites by using a Bayesian Gibbs sampling algorithm and an extensive protein localization data set. For 49 factors in common with a prior analysis of this data set (based largely on phylogenetic conservation), we find that half of the previously predicted binding motifs are in need of some revision. We also solve the inverse problem of ascertaining the factors from the binding sites by assigning a correct protein fold to 25 of the 49 cases from a previous study. Our approach is easily extended to other organisms, including higher eukaryotes. Our study highlights the utility of enlarging current structural genomics projects that exhaustively sample fold structure space to include all factors with significantly different DNA-binding specificities.

protein-DNA interactions | homology models of transcription factors | weight matrix predictions

Transcription factors (TFs) are regulatory proteins used by the cell to activate or repress gene transcription. They interact with short nucleotide sequences, typically located upstream of a gene, by means of DNA-binding domains that recognize their cognate binding sites. As a rule, regulation of gene transcription is analyzed by bioinformatics methods designed to detect statistically overrepresented motifs in promoter sequences.
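As a rough illustration of the weight matrix representation of a binding motif, the sketch below builds a log-odds matrix from a handful of aligned sites and scans a sequence for its best match. It is not the Bayesian Gibbs sampling algorithm used in this study. The aligned sites and the scanned sequence are toy strings invented for the example, and the function names (`build_pwm`, `score`, `best_hit`), the uniform background, and the pseudocount of 1 are all assumptions.

```python
import math
from collections import Counter

BASES = "ACGT"


def build_pwm(sites, background=None, pseudocount=1.0):
    """Build a log-odds weight matrix from equal-length aligned sites.

    Assumptions for this sketch: uniform background unless one is given,
    and a flat pseudocount added to every base at every position.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(width):
        counts = Counter(site[pos] for site in sites)
        column = {}
        for b in BASES:
            freq = (counts[b] + pseudocount) / total
            column[b] = math.log2(freq / background[b])
        pwm.append(column)
    return pwm


def score(pwm, seq):
    """Sum the per-position log-odds weights for a sequence of matrix width."""
    return sum(col[base] for col, base in zip(pwm, seq))


def best_hit(pwm, promoter):
    """Return (score, offset) of the highest-scoring window in the sequence."""
    width = len(pwm)
    hits = [
        (score(pwm, promoter[i:i + width]), i)
        for i in range(len(promoter) - width + 1)
    ]
    return max(hits)


if __name__ == "__main__":
    toy_sites = ["TGACTC", "TGACTA", "TGAGTC", "TGACTC"]  # invented toy data
    pwm = build_pwm(toy_sites)
    print(best_hit(pwm, "AAAATGACTCAAAA"))
```

A window that scores well above the background expectation is a candidate binding site; a motif-finding method additionally has to learn the matrix itself from unaligned sequences, which this sketch does not attempt.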
Intergenic sequences bound by the TF can be identified by using DNA microarray technology, including chromatin immunoprecipitation (ChIP-chip) (1, 2), protein binding (3), and DNA immunoprecipitation (DIP-chip) arrays (4). Of special note is a recent genome-wide study that used ChIP-chip analysis to profile in vivo genomic occupancies for 203 DNA-binding transcriptional regulators in Saccharomyces cerevisiae (2). Using these data, the authors predicted binding specificities for 65 TFs by using the genomes of related species, a number that was later increased to 98 by MacIsaac et al. (5).

The DNA-binding domains of TFs can be classified into a limited number of structural families (6, 7). Structural studies of protein-DNA complexes reveal that, within each family, the overall fold of the DNA-binding domain and its mode of interaction with the cognate binding site are remarkably conserved, resulting in a characteristic pattern of amino acid contacts with DNA bases. These interactions form the basis of the sequence-specific direct readout of nucleotide sequences by amino acid...