We show that intracellular transcription of G-rich regions produces novel DNA structures, visible by electron microscopy as large (150-500 bp) loops. These G-loops are formed cotranscriptionally, and they contain G4 DNA on one strand and a stable RNA/DNA hybrid on the other. G-loop formation requires a G-rich nontemplate strand and reflects the unusual stability of the rG/dC base pair. G-loops and G4 DNA form efficiently within plasmid genomes transcribed in vitro or in Escherichia coli. These results establish that G4 DNA can form in vivo, a finding with implications for stability and maintenance of all G-rich genomic regions.
BLM, the gene that is defective in Bloom's syndrome, encodes a protein homologous to RecQ subfamily helicases that functions as a 3′-5′ DNA helicase in vitro. We now report that the BLM helicase can unwind G4 DNA. The BLM G4 DNA unwinding activity is ATP-dependent and requires a short 3′ region of single-stranded DNA. Strikingly, G4 DNA is a preferred substrate of the BLM helicase, as measured both by efficiency of unwinding and by competition. These results suggest that G4 DNA may be a natural substrate of BLM in vivo and that the failure to unwind G4 DNA may cause the genomic instability and increased frequency of sister chromatid exchange characteristic of Bloom's syndrome.
G-rich genomic regions can form G4 DNA upon transcription or replication. We have quantified the potential for G4 DNA formation (G4P) of the 16,654 genes in the human RefSeq database, and then correlated gene function with G4P. We have found that very low and very high G4P correlate with specific functional classes of genes. Notably, tumor suppressor genes have very low G4P and proto-oncogenes have very high G4P. G4P of these genes is evenly distributed between exons and introns, and it does not reflect enrichment for CpG islands or local chromosomal environment. These results show that genomic structure undergoes selection based on gene function. Selection based on G4P could promote genomic stability (or instability) of specific classes of genes, or reflect mechanisms for global regulation of gene expression.
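To make the idea of scoring a sequence for G4-forming potential concrete, here is a minimal Python sketch. It is not the G4P calculation used in the abstract above, whose exact definition is not given here. It assumes a widely used quadruplex-motif heuristic (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) and reports the fraction of sliding windows that contain such a motif. The function names, the window size of 100 nt, and the step of 20 nt are all illustrative assumptions, and only the strand supplied is scanned.

```python
import re

# Assumed heuristic motif (not taken from the abstract): four runs of >=3 G,
# separated by three loops of 1-7 nucleotides each.
G4_MOTIF = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def count_g4_motifs(seq: str) -> int:
    """Count non-overlapping motif matches on the given strand."""
    return sum(1 for _ in G4_MOTIF.finditer(seq.upper()))


def g4_window_fraction(seq: str, window: int = 100, step: int = 20) -> float:
    """Fraction of sliding windows that fully contain at least one motif.

    The window and step defaults are hypothetical values chosen for illustration.
    """
    seq = seq.upper()
    if len(seq) < window:
        return 0.0
    starts = range(0, len(seq) - window + 1, step)
    # Pattern.search(string, pos, endpos) restricts the match to one window.
    hits = sum(1 for s in starts if G4_MOTIF.search(seq, s, s + window))
    return hits / len(starts)


if __name__ == "__main__":
    telomeric = "GGGTTAGGGTTAGGGTTAGGG"  # four G-runs with TTA loops
    gene_like = "A" * 100 + telomeric + "A" * 100
    print(count_g4_motifs(telomeric))
    print(g4_window_fraction(gene_like))
```

A per-gene score of this kind could then be compared across functional classes of genes. A fuller analysis would also scan the complementary strand, where the same motif appears as runs of C.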
Recent experiments provide fascinating examples of how G4 DNA and G4 RNA structures—aka quadruplexes—may contribute to normal biology and to genomic pathologies. Quadruplexes are transient and therefore difficult to identify directly in living cells, which initially caused skepticism regarding not only their biological relevance but even their existence. There is now compelling evidence for functions of some G4 motifs and the corresponding quadruplexes in essential processes, including initiation of DNA replication, telomere maintenance, regulated recombination in immune evasion and the immune response, control of gene expression, and genetic and epigenetic instability. Recognition and resolution of quadruplex structures is therefore an essential component of genome biology. We propose that G4 motifs and structures that participate in key processes compose the G4 genome, analogous to the transcriptome, proteome, or metabolome. This is a new view of the genome, which sees DNA as not only a simple alphabet but also a more complex geography. The challenge for the future is to systematically identify the G4 motifs that form quadruplexes in living cells and the features that confer on specific G4 motifs the ability to function as structural elements.
Recent advances have made a persuasive case for the existence of G4 DNA in living cells, but what, if any, are its functions? Experiments have established how G4 DNA may contribute to the biology of eukaryotic cells, and genomic analysis has identified new ways in which the potential to form G4 DNA may influence gene regulation and genomic stability. This Perspective highlights those advances and identifies some key open questions.