Transposable elements and retroviruses are found in most genomes, can be pathogenic and are widely used as gene-delivery and functional genomics tools. Exploring whether these genetic elements target specific genomic sites for integration and how this preference is achieved is crucial to our understanding of genome evolution, somatic genome plasticity in cancer and ageing, host-parasite interactions and genome engineering applications. High-throughput profiling of integration sites by next-generation sequencing, combined with large-scale genomic data mining and cellular or biochemical approaches, has revealed that the insertions are usually non-random. The DNA sequence, chromatin and nuclear context, and cellular proteins cooperate in guiding integration in eukaryotic genomes, leading to a remarkable diversity of insertion site distribution and evolutionary strategies.
Highlights
• High-throughput insertion site profiling of a LINE-1 (L1) element by ATLAS-seq
• Insertion is influenced strongly by DNA sequence but only weakly by chromatin state
• L1 integration preferences suggest a link with host DNA replication
• Post-insertion selection reshapes L1 distribution across functional genomic regions
Transposable elements (TEs) are sequences that are currently or were historically mobile, and they are present across all eukaryotic genomes. A growing interest in understanding the regulation and function of TEs has revealed seemingly dichotomous roles for these elements in evolution, development, and disease. On the one hand, many gene regulatory networks owe their organization to the spread of cis‐elements and DNA binding sites through TE mobilization during evolution. On the other hand, the uncontrolled activity of transposons can generate mutations and contribute to disease, including cancer, while their increased expression may also trigger immune pathways that result in inflammation or senescence. Interestingly, TEs have recently been found to have novel essential functions during mammalian development. Here, the function and regulation of TEs are discussed, with a focus on LINE1 in mammals. It is proposed that LINE1 is a beneficial endogenous dual regulator of gene expression and genomic diversity during mammalian development, and that both of these functions may be detrimental if deregulated in disease contexts.