FaBox is a collection of simple and intuitive web services that enable biologists and medical researchers to quickly perform typical tasks with sequence data. The services make it easy to extract, edit, and replace sequence headers and join or divide data sets based on header information. Other services include collapsing a set of sequences into haplotypes and automated formatting of input files for a number of population genetics programs, such as ARLEQUIN, TCS and MRBAYES. The toolbox is expected to grow on the basis of requests for particular services and converters in the future. FaBox is freely available at www.birc.au.dk/fabox.
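As a rough illustration of what collapsing a set of sequences into haplotypes involves, the sketch below groups FASTA records by identical sequence. It is not FaBox's implementation: the function names `parse_fasta` and `collapse_to_haplotypes` are hypothetical, and the sketch assumes aligned, equal-length sequences with no special handling of gaps or ambiguity codes.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def collapse_to_haplotypes(records):
    """Map each distinct sequence (compared case-insensitively) to its headers.

    Assumption: two records share a haplotype only if their sequences are
    character-for-character identical after upper-casing.
    """
    haplotypes = {}
    for header, seq in records:
        haplotypes.setdefault(seq.upper(), []).append(header)
    return haplotypes


if __name__ == "__main__":
    example = ">ind1\nACGT\n>ind2\nacgt\n>ind3\nACGA\n"
    for seq, headers in collapse_to_haplotypes(parse_fasta(example)).items():
        print(seq, len(headers), ",".join(headers))
```

Keeping the list of headers per haplotype preserves the link between each individual and its haplotype, which is the information that downstream population genetics input formats typically need.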
Spiders are ecologically important predators with complex venom and extraordinarily tough silk that enables capture of large prey. Here we present the assembled genome of the social velvet spider and a draft assembly of the tarantula genome that represent two major taxonomic groups of spiders. The spider genomes are large with short exons and long introns, reminiscent of mammalian genomes. Phylogenetic analyses place spiders and ticks as sister groups supporting polyphyly of the Acari. Complex sets of venom and silk genes/proteins are identified. We find that venom genes evolved by sequential duplication, and that the toxic effect of venom is most likely activated by proteases present in the venom. The set of silk genes reveals a highly dynamic gene evolution, new types of silk genes and proteins, and a novel use of aciniform silk. These insights create new opportunities for pharmacological applications of venom and biomaterial applications of silk.
Building a population-specific catalogue of single nucleotide variants (SNVs), indels and structural variants (SVs) with frequencies, termed a national pan-genome, is critical for further advancing clinical and public health genetics in large cohorts. Here we report a Danish pan-genome obtained from sequencing 10 trios to high depth (50×). We report 536k novel SNVs and 283k novel short indels from mapping approaches and develop a population-wide de novo assembly approach to identify 132k novel indels larger than 10 nucleotides with low false discovery rates. We identify a higher proportion of indels and SVs than previous efforts showing the merits of high coverage and de novo assembly approaches. In addition, we use trio information to identify de novo mutations and use a probabilistic method to provide direct estimates of 1.27e−8 and 1.5e−9 per nucleotide per generation for SNVs and indels, respectively.
Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits 1-4. Genetic variation is identified mainly by mapping short reads to the reference genome or by performing local assembly 2,5-7. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology 4,8-13. We use the assemblies to identify a rich set of structural variants including many novel insertions and demonstrate how this variant catalogue enables further deciphering of known association mapping signals. We leverage the assemblies to provide 100 completely resolved major histocompatibility complex haplotypes and to resolve major parts of the Y chromosome. Our study provides a regional reference genome that we expect will improve the power of future association mapping studies and hence pave the way for precision medicine initiatives, which now are being launched in many countries including Denmark.
Using a combination of high-depth (average 78×) Illumina paired-end and mate-pair libraries, we applied Allpaths-LG 14 to create de novo assemblies of high quality and coverage for each of the 150 individuals with a median scaffold N50 of ~21 megabases (Mb; maximum ~30 Mb) (Supplementary Table 1). The 100 largest scaffolds in each of the 140 best assemblies typically covered more than 75% (median 77%, Extended Data Fig. 1a) of the genome, with the largest scaffolds exceeding 110 Mb in size (Supplementary Table 1). To evaluate the accuracy of the assemblies, we subsequently aligned the scaffolds for each individual to the human reference genome (GRCh38) 15. Figure 1 shows an example individual where the euchromatic part of each chromosome was almost completely covered by a few large scaffolds and in several cases scaffolds covered almost entire chromosome arms. Only rarely did we find that large scaffolds break and align to more than one chromosome (Extended Data Fig. 1b), suggesting that even the largest scaffolds are seldom chimaeric. We also compared our de novo assemblies with a published long-read assembly based on BioNano mapping and PacBio sequencing 16. Extended Data Figs 2a and 3 show that this assembly was less complete than our assemblies, but with similar scaffold lengths. The long-read assembly had 5.38% missing data compared with our median of 4.25% (Extended Data Fig. 3a), but the missing data in our assemblies were found in smaller gaps (Extended Data Fig. 3b, c), and the median contig length was therefore much smaller th...
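The scaffold N50 quoted in the passage above is a standard assembly statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of the usual definition follows; the function name `n50` and the example lengths are illustrative and not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one scaffold length")


if __name__ == "__main__":
    # Illustrative lengths only (arbitrary units), not data from the paper.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because N50 depends only on the length distribution, it says nothing about correctness, which is why the text pairs it with alignment to the reference genome as a separate accuracy check.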
Bladder cancer (or urothelial cell carcinoma [UCC]) is characterized by field disease (malignant alterations in surrounding mucosa) and frequent recurrences. Whole-genome, exome, and transcriptome sequencing of 38 tumors, including four metachronous tumor pairs and 20 superficial tumors, identified an APOBEC mutational signature in one-third. This was biased toward the sense strand, correlated with mean expression level, and clustered near breakpoints. A>G mutations were up to eight times more frequent on the sense strand (p<0.002) in [ACG]AT contexts. The patient-specific APOBEC signature was negatively correlated to repair-gene expression and was not related to clinicopathological parameters. Mutations in gene families and single genes were related to tumor stage, and expression of chromatin modifiers correlated with survival. Evolutionary and subclonal analyses of early/late tumor pairs showed a unitary origin, and discrete tumor clones contained mutated cancer genes. The ancestral clones contained Pik3ca/Kdm6a mutations and may reflect the field-disease mutations shared among later tumors.
Obligate mating of females (queens) with multiple males has evolved only rarely in social Hymenoptera (ants, social bees, social wasps) and for reasons that are fundamentally different from those underlying multiple mating in other animals. The monophyletic tribe of ('attine') fungus-growing ants is known to include evolutionarily derived genera with obligate multiple mating (the Acromyrmex and Atta leafcutter ants) as well as phylogenetically basal genera with exclusively single mating (e.g. Apterostigma, Cyphomyrmex, Myrmicocrypta). All attine genera share the unique characteristic of obligate dependence on symbiotic fungus gardens for food, but the sophistication of this symbiosis differs considerably across genera. The lower attine genera generally have small, short-lived colonies and relatively non-specialized fungal symbionts (capable of living independently of their ant hosts), whereas the four evolutionarily derived higher attine genera have highly specialized, long-term clonal symbionts. In this paper, we investigate whether the transition from single to multiple mating occurred relatively recently in the evolution of the attine ants, in conjunction with the novel herbivorous 'leafcutter' niche acquired by the common ancestor of Acromyrmex and Atta, or earlier, at the transition to rearing specialized long-term clonal fungi in the common ancestor of the larger group of higher attines that also includes the genera Trachymyrmex and Sericomyrmex. We use DNA microsatellite analysis to provide unambiguous evidence for a single, late and abrupt evolutionary transition from exclusively single to obligatory multiple mating. This transition is historically correlated with other evolutionary innovations, including the extensive use of fresh vegetation as substrate for the fungus garden, a massive increase in mature colony size and morphological differentiation of the worker caste.
Gene delivery by human immunodeficiency virus type 1 (HIV-1)-based lentiviral vectors (LVs) is efficient, but genomic integration of the viral DNA is strongly biased toward transcriptionally active loci resulting in an increased risk of insertional mutagenesis in gene therapy protocols. Nonviral Sleeping Beauty (SB) transposon vectors have a significantly safer insertion profile, but efficient delivery into relevant cell/tissue types is a limitation. In an attempt to combine the favorable features of the two vector systems we established a novel hybrid vector technology based on SB transposase-mediated insertion of lentiviral DNA circles generated during transduction of target cells with integrase (IN)-defective LVs (IDLVs). By construction of a lentivirus-transposon hybrid vector allowing transposition exclusively from circular viral DNA substrates, we demonstrate that SB transposase added in trans directs efficient transposon mobilization from DNA circles in vector-transduced cells. Both transfected plasmid DNA and transduced IDLVs can serve as the source of active transposase. Most important, we demonstrate that the SB transposase overrides the natural lentiviral integration pathway and directs vector integration less frequently toward transcriptional units, resulting in a random genomic integration profile. The novel hybrid vector system combines the attractive features of efficient gene delivery by viral transduction and a safer genomic integration profile by DNA transposition.
E.H.E. was supported by Health Faculty, Aarhus University and Kong Christian Den Tiendes Fond. K.H. and S.F. were supported by an MRC (UK) project grant MR/M012638/1. K.L.H. was supported by grants from Fonden til Lægevidenskabens Fremme, Kong Christian Den Tiendes Fond. K.L.H. and L.S. were supported by the IDEAS grant from Aarhus University Research Foundation (AUFF). There are no conflicts of interest.