It has been shown that the Escherichia coli purE locus specifying 5'-phosphoribosyl-5-amino-4-imidazole carboxylase in de novo purine nucleotide synthesis is divided into two cistrons. We cloned and sequenced a 2,449-nucleotide region including the purE locus. This sequence contains two overlapping open reading frames, ORF-18 and ORF-39, encoding proteins with molecular weights of 18,000 and 39,000, respectively. The purE mutations of CSH57A and DCSP22 were complemented by plasmids carrying ORF-18, while that of NK6051 was complemented by plasmids carrying ORF-39. Thus, the purE locus consists of two distinct genes, designated purE and purK for ORF-18 and ORF-39, respectively. These genes constitute a single operon. A highly conserved 16-nucleotide sequence, termed the PUR box, was found in the upstream region of purE by comparison with the sequences of the purF and purMN operons. We also found three entire and one partial repetitive extragenic palindromic (REP) sequences in the downstream region of purK. Roles of the PUR box and REP sequences are discussed in relation to the genesis of the purEK operon.

5'-Phosphoribosyl-5-amino-4-imidazole (AIR) carboxylase (EC 4.1.1.21) catalyzes the conversion of AIR to carboxyl AIR in de novo purine biosynthesis (17). In Escherichia coli, AIR carboxylase is encoded by the purE locus, which is located at 12 min on the chromosome (1). Genetic studies showed that the purE locus was divided into two complementation groups, purE1 and purE2 (8, 12). However, it is not clear whether these two cistrons specify two distinct polypeptides or two functional domains in a single polypeptide.

Hamilton and Reeve (9, 10) determined the nucleotide sequences of DNA fragments from Methanobrevibacter smithii and Methanobacterium thermoautotrophicum that were able to complement both purE1 and purE2 mutations of E. coli.
These sequences encoded a single polypeptide chain whose structure appeared to arise from the fusion of tandemly duplicated polypeptide chains. They also reported that the M. thermoautotrophicum gene carrying a small deletion in its 3' domain did not complement either the purE1 or the purE2 mutation (10). On the other hand, sequence analysis of a 12-purine gene cluster from Bacillus subtilis showed that two distinct genes, purE and purK, were responsible for the activity of AIR carboxylase (5). These data suggest considerable variation in the organization of the genes and operons of AIR carboxylase from one organism to another.

The E. coli DNA fragment that complements purE1 and purE2 mutations has been cloned (13; J. M. Smith, cited in reference 5) and sequenced (J. M. Smith, cited in reference 5). However, the details of the sequencing have not been reported. To explore the organization of the E. coli purE locus, we independently cloned the chromosomal fragment of this region and determined its nucleotide sequence.

In this report, we demonstrate that the E. coli purE locus consists of two overlapping genes in a single operon. We * Corresponding author. t Publication of thi...