We previously demonstrated that sustainable enhanced levels of transgene products could be expressed from a bacterial DNA-free expression cassette, either formed from a fragmented plasmid in mouse liver or delivered as a minicircle vector. This suggested that bacterial DNA sequences played a role in episomal transgene silencing. To further understand the silencing mechanism, we systematically altered the DNA components in both the expression cassette and the bacterial backbone, and compared the gene expression profiles of mice receiving different DNA forms. In nine vectors tested, animals that received the purified expression cassette alone always expressed persistently higher levels of transgene than the 2fDNA groups. In contrast, animals that received DNA linearized by a single cut in the bacterial backbone had expression profiles similar to those of the intact plasmid groups. All three linear DNAs formed large concatemers and small circles in mouse liver, while ccDNA remained intact. In all groups, the relative amount of vector DNA in liver remained similar. Together, these results further established that the DNA silencing effect was mediated by a covalent linkage between the expression cassette and the bacterial DNA elements.
The unmapped readspace of whole genome sequencing data tends to be large but is often ignored. We posit that it contains valuable signals of both human infection and contamination. Using unmapped and poorly aligned reads from whole genome sequences (WGS) of over 1000 families and nearly 5000 individuals, we present insights into the common viral, bacterial, and computational contamination that plagues whole genome sequencing studies. We present several notable results: (1) In addition to known contaminants such as Epstein-Barr virus and phiX, sequences from whole blood and lymphocyte cell lines contain many other contaminants, likely originating from storage, prep, and sequencing pipelines. (2) A sample's sequencing plate and biological source strongly influence its contamination profile. (3) Y-chromosome fragments not on the human reference genome commonly mismap to bacterial reference genomes. Both experiment-derived and computational contamination are prominent in next-generation sequencing data. Such contamination can compromise results from WGS as well as metagenomics studies, and standard protocols for identifying and removing contamination should be developed to ensure the fidelity of sequencing-based studies.