The complete sequences of the RhsB and RhsC elements of Escherichia coli K-12 have been determined. These sequence data reveal a new repeated sequence, called H-rpt (Hinc repeat), which is distinct from the Rhs core repetition that is found in all five Rhs elements. H-rpt is found in RhsB, RhsC, and RhsE. Characterization of H-rpt supports the view that the Rhs elements are composite structures assembled from components with very different evolutionary histories and that their incorporation into the E. coli genome is relatively recent. In each case, H-rpt is found downstream from the Rhs core and is separated from the core by a segment of DNA that is unique to the individual element. The H-rpts of RhsB and RhsE are very similar, diverging by only 2.1%. They are 1,291 bp in length, and each contains a 1,134-bp open reading frame (ORF). RhsC has three tandem copies of H-rpt, all of which appear defective in that they have large deletions and/or have the reading frame interrupted. Features of H-rpt are analogous to features typical of insertion sequences; however, no associated transposition activity has been detected. A 291-bp fragment of H-rpt is found near min 5 of the E. coli K-12 map and is not associated with any Rhs core homology. The complete core sequences of RhsB and RhsC have been compared with that of RhsA. As anticipated, the three core sequences are closely related, all having the same length of 3,714 bp. Like RhsA, the RhsB and RhsC cores constitute single ORFs that begin with the first core base. In each case, the core ORF extends beyond the core into the unique sequence. Of the three cores, RhsB and RhsA are the most similar, showing only 0.9% sequence divergence, while RhsB and RhsC are the least similar, diverging by 2.9%. All three cores conserve the 28 repetitions of a peptide motif noted originally for RhsA. A secondary structure is proposed for this motif, and the possibility of its having an extracellular binding function is discussed.
RhsB contains one additional unique ORF, and RhsC contains two additional unique ORFs. One of these ORFs includes a signal peptide that is functional when fused to TnphoA.
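The divergence figures quoted above (for example, 0.9% between the RhsA and RhsB cores) are percentages of differing positions between aligned sequences. As a minimal sketch of that arithmetic, assuming two sequences that are already aligned to equal length with no gaps, the calculation could look like the following. The function name `percent_divergence` and the toy sequences are hypothetical illustrations, not data or methods from the paper.

```python
def percent_divergence(seq_a: str, seq_b: str) -> float:
    """Percent of mismatched positions between two pre-aligned, equal-length sequences.

    Assumption: the inputs are already aligned and contain no gap characters;
    this sketch does not perform alignment itself.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    # Count positions where the two bases differ (case-insensitive).
    mismatches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return 100.0 * mismatches / len(seq_a)


# Toy 20-base example (made-up sequences, not Rhs data): one mismatch in 20 bases.
toy_a = "ATGAGCGGAAAACCGGCGGC"
toy_b = "ATGAGCGGTAAACCGGCGGC"
print(percent_divergence(toy_a, toy_b))
```

Under the same no-gap assumption, 0.9% divergence over a 3,714-bp core would correspond to roughly 33 differing positions.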
The complete nucleotide sequence of the rhsA locus and selected portions of other members of the rhs multigene family of Escherichia coli K-12 have been determined. A definition of the limits of the rhsA and rhsC loci was established by comparing sequences from E. coli K-12 with sequences from an independent E. coli isolate whose DNA contains no homology to the rhs core. This comparison showed that rhsA comprises 8,249 base pairs (bp) in strain K-12 and that the Rhs° strain, instead, contains an unrelated 32-bp sequence. Similarly, the K-12 rhsC locus is 9.6 kilobases in length and a 10-bp sequence resides at its location in the Rhs° strain.
Families of homologous but nonidentical genes present a number of special genetic questions. These questions include the possible specific roles of individual members, the degree to which members exchange heterologies through intrachromosomal recombination, and the effects of these exchanges on function. The presence of these multigene families also has important implications for chromosome rearrangement and evolution (17). However, except for the rrn operons encoding ribosomal RNA, multigene families are quite rare in Escherichia coli (19). We have recently reported an unusual and complex family, the rhs family, that is comparable to the rrn family in number, length of shared homology, and degree of sequence similarity. The rhs loci were originally detected through their action as rearrangement hot spots, providing homology for recA-dependent intrachromosomal recombination (4, 11). Consequently, the rhs loci were defined to include the homologous sequences shared by two or more of the respective loci. Four rhs loci of E. coli strain K-12 have been characterized extensively, and evidence for a fifth has been noted (4, 11, 20). A distinctive feature of the rhs loci is that they share a highly conserved 3.7-kilobase (kb) core sequence.
The cores are generally flanked by dissimilar sequences, and for two of the loci, rhsA and rhsC, one or more partial core repetitions are present downstream from the intact core. Sequence comparison of the first 300 nucleotides of the rhs cores (20) revealed that the core homology begins precisely with a start codon initiating an open reading frame (ORF) and that the rhsA, rhsB, and rhsC cores are closely related, showing only 1 to 2% sequence divergence. By contrast, rhsD is 18% divergent from the others. A total of nine mismatches distinguish rhsA, rhsB, and rhsC through these 300 nucleotides, but none of the nine causes an amino acid substitution in the core ORFs. However, rhsD differs from the others by eight amino acids. This degree of divergence through predominantly neutral mutation indicates that the cores have been evolving independently for quite some time. Application of the mutation rate estimated for enteric bacteria by Ochman and Wilson (15) suggests that the rhsA and rhsC cores diverged on the order of 10 million years ago, and the extent of sequence divergence of rhsD would indicate that it radiated from the others on the order of 100 mill...
The Rhs family of composite genetic elements was assessed for variation among independent Escherichia coli strains of the ECOR reference collection. The location and content of the RhsA-B-C-F subfamily correlate highly with the clonal structure of the ECOR collection. This correlation exists at several levels: the presence of Rhs core homology in the strain, the location of the Rhs elements present, and the identity of the Rhs core-extensions associated with each element. A provocative finding was that an identical 1518-bp segment, covering core-extension-b1 and its associated downstream open reading frame, is present in two distinct clonal groups, but in association with different Rhs elements. The sequence identity of this segment, when contrasted with the divergence of other chromosomal segments, suggests that shuffling of Rhs core extensions has been a relatively recent variation. Nevertheless, the copies of core-extension-b1 were placed within the respective Rhs elements before the emergence of the clonal groups. In the course of this analysis, two new Rhs elements absent from E. coli K-12 were discovered: RhsF, a fourth member of the RhsA-B-C-F subfamily, and RhsG, the prototype of a third Rhs subfamily.