It is shown that the amino acid sequence and the DNA gene sequence of the 25 amino-terminal residues of the lac repressor protein of Escherichia coli are homologous with the sequences of five DNA-binding proteins: the cro repressor proteins from phage λ and phage 434, the cI and cII proteins from phage λ, and the repressor protein from Salmonella phage P22. The region of homology between lac repressor and the other proteins coincides with the principal DNA-binding region of cro repressor. In particular, residues Tyr-17 through Gln-26 of lac repressor correspond to the α-helix Gln-27 through Ala-36 of cro repressor, which we have postulated to bind within the major groove of the DNA.

Fig. 1 shows a comparison of the amino-terminal sequences of a series of proteins that bind to sequence-specific regions of double-stranded DNA (5). cro and 434-cro are small repressor proteins from bacteriophage λ and bacteriophage 434, respectively (10-13). "cI" (often referred to as λ repressor) and "P22" are larger repressor proteins from phage λ and from Salmonella phage P22, respectively, that, under different circumstances, can mediate positive or negative control of gene transcription (14-18). "cII" is also larger than cro and, in conjunction with another protein (cIII), acts as a positive regulator of transcription in bacteriophage λ (16,17,19,20). With the exception of cro and cI, these five proteins all recognize different sequences on the DNA.

As can be seen in Fig. 1, there is extensive amino acid sequence homology between the five DNA-binding proteins. The correspondence between the five proteins can also be seen in the DNA gene sequences that code for the respective polypeptides, and, on the basis of this sequence homology, we have argued (5) that these proteins all have in common a region of tertiary structure corresponding to the segments labeled α1, α2, and α3.
In the cro structure, α1 and α2 are "structural" α-helices, whereas α3 is the "DNA recognition" α-helix, which we have postulated to lie within the major groove of B-form DNA and to be primarily responsible for the specific recognition of the DNA by the protein (4).

A series of comparisons of both the amino acid sequence (21,22) and the DNA coding sequence (23) indicates that the lac repressor protein from E. coli ("lac") may also have structural features in common with the other DNA-binding proteins. In Fig. 1 we have included the first 38 amino acids of lac, aligned to maximize the homology with the other proteins. The homology is most striking in the region 19-32 of cro (9-22 of lac), where Gly-24 is invariant and Ala-20 and Val-25 of cro, which are conserved in four of the five proteins, also occur in lac.

The homology between lac and the other proteins can also be seen at the level of the DNA sequences that code for the respective polypeptides.
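
The residue correspondences quoted above (cro 19-32 with lac 9-22, and cro Gln-27 through Ala-36 with lac Tyr-17 through Gln-26) are both consistent with a constant numbering offset of 10 between the two proteins. The short Python sketch below simply makes that arithmetic explicit. It assumes, as an illustration only, that the alignment is ungapped over cro residues 19-36 (the union of the two quoted ranges); the function name and the range bounds are hypothetical and are not taken from Fig. 1.

```python
# Illustrative sketch only: converts cro residue numbers to lac residue
# numbers within the aligned segment discussed in the text.

# Offset implied by the text: cro 19-32 <-> lac 9-22, cro 27-36 <-> lac 17-26.
CRO_TO_LAC_OFFSET = 10

# Assumption: the alignment has no gaps over the union of the two quoted
# cro ranges (19-32 and 27-36). Outside this window no mapping is claimed.
CRO_ALIGNED_FIRST = 19
CRO_ALIGNED_LAST = 36


def cro_to_lac(cro_position: int) -> int:
    """Return the lac residue number aligned with a given cro residue number."""
    if not CRO_ALIGNED_FIRST <= cro_position <= CRO_ALIGNED_LAST:
        raise ValueError(
            f"cro position {cro_position} is outside the assumed ungapped "
            f"window {CRO_ALIGNED_FIRST}-{CRO_ALIGNED_LAST}"
        )
    return cro_position - CRO_TO_LAC_OFFSET


if __name__ == "__main__":
    # Conserved cro positions named in the text, with their lac counterparts
    # under the assumed offset.
    for label, cro_position in [("Ala-20", 20), ("Gly-24", 24), ("Val-25", 25)]:
        print(f"cro {label} aligns with lac position {cro_to_lac(cro_position)}")
```

Under this assumption the conserved cro residues Ala-20, Gly-24, and Val-25 fall at lac positions 10, 14, and 15, inside the 9-22 segment where the homology is described as most striking.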