Chromosome structural variation (SV) is a normal part of variation in the human genome, but some classes of SV can cause neurodevelopmental disorders. Analysis of the DNA sequence at SV breakpoints can reveal mutational mechanisms and risk factors for chromosome rearrangement. Large-scale SV breakpoint studies have recently become possible owing to advances in next-generation sequencing (NGS), including whole-genome sequencing (WGS). These studies have shed light on complex forms of SV such as triplications, inverted duplications, insertional translocations, and chromothripsis. Sequence-level breakpoint data resolve SV structure and determine how genes are disrupted, fused, and/or misregulated by breakpoints. Recent improvements in breakpoint sequencing have also revealed non-allelic homologous recombination (NAHR) between paralogous long interspersed nuclear element (LINE) or human endogenous retrovirus (HERV) repeats as a cause of deletions, duplications, and translocations. This review covers the genomic organization of simple and complex constitutional SVs, as well as the molecular mechanisms of their formation.
Interpreting the genomic and phenotypic consequences of copy-number variation (CNV) is essential to understanding the etiology of genetic disorders. Whereas deletion CNVs clearly lead to haploinsufficiency, duplications might cause disease through triplosensitivity, gene disruption, or gene fusion at breakpoints. The mutational spectrum of duplications has been studied at certain loci, and in some cases these copy-number gains are complex chromosome rearrangements involving triplications and/or inversions. However, the organization of clinically relevant duplications throughout the genome has yet to be investigated on a large scale. Here we fine-mapped 184 germline duplications (14.7 kb-25.3 Mb; median 532 kb) ascertained from individuals referred for diagnostic cytogenetics testing. We performed next-generation sequencing (NGS) and whole-genome sequencing (WGS) to sequence 130 breakpoints from 112 subjects with 119 CNVs and found that most (83%) were tandem duplications in direct orientation. The remainder were triplications embedded within duplications (8.4%), adjacent duplications (4.2%), insertional translocations (2.5%), or other complex rearrangements (1.7%). Moreover, we predicted six in-frame fusion genes at sequenced duplication breakpoints; four gene fusions were formed by tandem duplications, one by two interconnected duplications, and one by a duplication inserted at another locus. These unique fusion genes could be related to clinical phenotypes and warrant further study. Although most duplications are positioned head-to-tail adjacent to the original locus, those that are inverted, triplicated, or inserted can disrupt or fuse genes in a manner that might not be predicted by conventional copy-number assays. Therefore, interpreting the genetic consequences of duplication CNVs requires breakpoint-level analysis.
Unbalanced translocations are a relatively common type of copy number variation and a major contributor to neurodevelopmental disorders. We analyzed the breakpoints of 57 unique unbalanced translocations to investigate the mechanisms by which they form. Fifty-one are simple unbalanced translocations between two different chromosome ends, and six rearrangements have more than three breakpoints involving two to five chromosomes. Sequencing 37 breakpoint junctions revealed that simple translocations have between 0 and 4 base pairs (bp) of microhomology (n = 26), short inserted sequences (n = 8), or paralogous repeats (n = 3) at the junctions, indicating that translocations do not arise primarily from nonallelic homologous recombination but instead form most often via nonhomologous end joining or microhomology-mediated break-induced replication. Three simple translocations fuse genes that are predicted to produce in-frame transcripts of SIRPG-WWOX, SMOC2-PROX1, and PIEZO2-MTA1, which may lead to gain of function. Three complex translocations have inversions, insertions, and multiple breakpoint junctions between only two chromosomes. Whole-genome sequencing and fluorescence in situ hybridization analysis of two de novo translocations revealed at least 18 and 33 breakpoints involving five different chromosomes. Breakpoint sequencing of one maternally inherited translocation involving four chromosomes uncovered multiple breakpoints with inversions and insertions. All of these breakpoint junctions had 0-4 bp of microhomology consistent with chromothripsis, and both de novo events occurred on paternal alleles. Together with other studies, these data suggest that germline chromothripsis arises in the paternal genome and may be transmitted maternally. Breakpoint sequencing of our large collection of chromosome rearrangements provides a comprehensive analysis of the molecular mechanisms behind translocation formation.
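The microhomology measurement described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `microhomology_length`, the toy sequences, and the coordinate convention are all assumptions made for illustration. The sketch assumes both reference segments are given in the orientation of the derivative chromosome, in uppercase, and that the junction joins `left_ref[:left_break]` to `right_ref[right_break:]`. Microhomology is counted as the number of bases around the breakpoint that are identical in both references, so that the exact junction position is ambiguous over that stretch.

```python
def microhomology_length(left_ref, left_break, right_ref, right_break):
    """Count bases shared by both references at a breakpoint junction.

    Hypothetical helper for illustration only. The derivative sequence is
    assumed to be left_ref[:left_break] + right_ref[right_break:], with both
    references already in the orientation of the derivative chromosome.
    """
    # Extend rightward: bases just past the left breakpoint that match
    # the first retained bases of the right reference.
    forward = 0
    while (left_break + forward < len(left_ref)
           and right_break + forward < len(right_ref)
           and left_ref[left_break + forward] == right_ref[right_break + forward]):
        forward += 1

    # Extend leftward: last retained bases of the left reference that match
    # the bases just before the right breakpoint.
    backward = 0
    while (left_break - 1 - backward >= 0
           and right_break - 1 - backward >= 0
           and left_ref[left_break - 1 - backward] == right_ref[right_break - 1 - backward]):
        backward += 1

    return forward + backward


# Toy example (invented sequences, not data from the study).
shared = microhomology_length("ACGTTAGGCATT", 6, "TTCAGGACCGTA", 4)
blunt = microhomology_length("AAAA", 2, "CCCC", 2)
print(shared, blunt)
```

In the toy example the two references share the bases AGG across the breakpoint, so the junction could be placed at any of four positions within that stretch. A junction with no shared bases returns 0, the blunt end of the 0-4 bp range reported above.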
Heat-resistant agglutinin 1 (Hra1) is an accessory colonization factor of enteroaggregative Escherichia coli (EAEC) strain 042. Tia, a close homolog of Hra1, is an invasin and adhesin that has been described in enterotoxigenic E. coli. We devised a PCR-restriction fragment length polymorphism screen for the associated genes and found that they occur among 55 (36.7%) of the enteroaggregative E. coli isolates screened, as well as lower proportions of enterotoxigenic, enteropathogenic, enterohemorrhagic, and commensal E. coli isolates. Overall, 25%, 8%, and 3% of 150 EAEC strains harbored hra1 alone, tia alone, or both genes, respectively. One EAEC isolate, 60A, produced an amplicon with a unique restriction profile, distinct from those of hra1 and tia. We cloned and sequenced the full-length agglutinin gene from strain 60A and have designated it hra2. The hra2 gene was not detected in any of 257 diarrheagenic E. coli isolates in our collection but is present in the genome of Salmonella enterica serovar Heidelberg strain SL476. The cloned hra2 gene from strain 60A, which encodes a predicted amino acid sequence that is 64% identical to that of Hra1 and 68% identical to that of Tia, was sufficient to confer adherence on E. coli K-12. We constructed an hra2 deletion mutant of EAEC strain 60A. The mutant was deficient in adherence but not autoaggregation or invasion, pointing to a functional distinction from the autoagglutinin Hra1 and the Tia invasin. Hra1, Tia, and the novel accessory adhesin Hra2 are members of a family of integral outer membrane proteins that confer different colonization-associated phenotypes.
Enteroaggregative Escherichia coli (EAEC) strains are increasingly implicated in human diarrhea, especially among children living in developing countries (18, 33). EAEC strains are exceptional colonizers and are defined by a characteristic stacked-brick adherence pattern (43). This aggregative pattern of adherence is a convergent phenotype produced in different lineages by a variety of adhesins, only some of which have been described (35). To date, most studies of EAEC adherence have focused on structural adhesins known as aggregative adherence fimbriae. However, recent research has shown that EAEC strains also harbor an expanding repertoire of nonstructural outer membrane proteins that contribute to colonization (1, 4, 14, 28).
One such outer membrane protein is heat-resistant agglutinin 1 (Hra1), an accessory adhesin that we recently characterized in EAEC strain 042 (1). The hra1 gene (along with its 90% identical allelic variant hek, reported from uropathogenic E. coli and neonatal meningitic E. coli [11, 39]) is predicted to encode a 29-kDa precursor, which is processed to a 25-kDa outer membrane protein. The 792-bp hra1 gene is sufficient to confer agglutination of human erythrocytes, bacterial autoaggregation, enhanced biofilm formation, and aggregative adherence to cultured HEp-2 cells (1, 25). Hra1 shares 67% identity with the previously characterized outer membrane invasin and adhesin Tia (12, 13). The tia...