We developed a novel software tool, EXCAVATOR, for the detection of copy number variants (CNVs) from whole-exome sequencing data. EXCAVATOR combines a three-step normalization procedure with a novel heterogeneous hidden Markov model algorithm and a calling method that classifies genomic regions into five copy number states. We validate EXCAVATOR on three datasets and compare the results with those of three other methods. These analyses show that EXCAVATOR outperforms the other methods and is therefore a valuable tool for the investigation of CNVs in large-scale projects, as well as in clinical research and diagnostics. EXCAVATOR is freely available at http://sourceforge.net/projects/excavatortool/.
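As a rough illustration of what classifying regions into five copy number states can look like, the sketch below assigns a segment's mean log2 ratio to the nearest expected value for 0 to 4 copies in a diploid genome. This is not EXCAVATOR's calling method: the state labels, the `floor` parameter, and the nearest-value rule are all assumptions made for illustration only.

```python
import math

# Hypothetical state labels; the abstract only says there are five states.
STATES = {
    0: "two-copy deletion",
    1: "one-copy deletion",
    2: "normal",
    3: "one-copy duplication",
    4: "multi-copy amplification",
}


def expected_log2(copies, floor=0.05):
    """Expected log2(test/control) ratio for a diploid region with `copies` copies.

    `floor` (an assumed value) avoids log2(0) for homozygous deletions.
    """
    return math.log2(max(copies, floor) / 2)


def classify(log2_ratio):
    """Return the label of the state whose expected log2 ratio is closest."""
    nearest = min(STATES, key=lambda c: abs(log2_ratio - expected_log2(c)))
    return STATES[nearest]


if __name__ == "__main__":
    for ratio in (-4.0, -1.0, 0.1, 0.6, 1.2):
        print(ratio, classify(ratio))
```

A real caller would also account for noise, tumor purity, and segment length; the nearest-value rule is used here only because it is the simplest way to show the five-way split.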
Structural variants are genomic rearrangements larger than 50 bp that account for around 1% of the variation among human genomes. They affect phenotypic diversity and play a role in various diseases, including neurological/neurocognitive disorders and cancer development and progression. Dissecting structural variants from next-generation sequencing data presents several challenges, and a number of approaches have been proposed in the literature. In this mini review, we describe and summarize the latest tools, and their underlying algorithms, designed for the analysis of whole-genome sequencing, whole-exome sequencing, custom capture, and amplicon sequencing data, pointing out their major advantages and drawbacks. We also summarize the most recent applications of third-generation sequencing platforms. This assessment provides practical guidance, with particular emphasis on human genetics and copy number variants, for researchers involved in the investigation of these genomic events.
The nanopore sequencing process is based on the transit of a DNA molecule through a nanoscopic pore, and since the 1990s it has been considered one of the most promising approaches to detect polymeric molecules. In 2014, Oxford Nanopore Technologies (ONT) launched a beta-testing program that supplied the scientific community with the first prototype of a nanopore sequencer: the MinION. Thanks to this program, several research groups had the opportunity to evaluate the performance of this novel instrument and develop novel computational approaches for analyzing this new generation of data. Despite the short time since the release of the MinION, a large number of algorithms and tools have been developed for base calling, data handling, read mapping, de novo assembly and variant discovery. Here, we address the main computational challenges related to the analysis of nanopore data, and we carry out a comprehensive and up-to-date survey of the algorithmic solutions adopted by the bioinformatics community, comparing their performance and reporting the limits and advantages of using this new generation of sequences for genomic analyses. Our analyses demonstrate that the use of nanopore data dramatically improves the de novo assembly of genomes and allows for the exploration of structural variants with unprecedented accuracy and resolution. However, despite the impressive improvements achieved by ONT in the past 2 years, the use of these data for small-variant calling is still challenging, and at present, it needs to be coupled with complementary short-read sequences to mitigate the intrinsic biases of nanopore sequencing technology.
We proposed a novel computational framework, named chimEric tranScript detection algorithm (EricScript), for the identification of gene fusion products in paired-end RNA-seq data. Our simulation study on synthetic data demonstrates that EricScript achieves higher sensitivity and specificity than existing methods, with noticeably lower running times. We also applied our method to publicly available RNA-seq tumour datasets and showed its capability to rediscover known gene fusions.
After haplotype reconstruction, logistic regression analyses adjusted for traditional risk factors and COPD showed a significant association between AAA and AHCY, FOLH1, MTHFD1, MTR, NNMT, PON1 and TYMS haplotypes. Our findings offer new insights into the pathogenesis of AAA.
Acute tissue injury causes DNA damage and repair processes involving increased cell mitosis and polyploidization, leading to cell function alterations that may potentially drive cancer development. Here, we show that acute kidney injury (AKI) increased the risk for papillary renal cell carcinoma (pRCC) development and tumor relapse in humans, as confirmed by data collected from several single-center and multicentric studies. Lineage tracing of tubular epithelial cells (TECs) after AKI induction and long-term follow-up in mice showed time-dependent onset of clonal papillary tumors in an adenoma-carcinoma sequence. Among AKI-related pathways, NOTCH1 overexpression in human pRCC was associated with worse outcome and was specific for type 2 pRCC. Mice overexpressing NOTCH1 in TECs developed papillary adenomas and type 2 pRCCs, and AKI accelerated this process. Lineage tracing in mice identified single renal progenitors as the cell of origin of papillary tumors. Single-cell RNA sequencing showed that the human renal progenitor transcriptome is similar to that of PT1, the putative cell of origin of human pRCC. Furthermore, NOTCH1 overexpression in cultured human renal progenitor cells induced tumor-like 3D growth. Thus, AKI can drive tumorigenesis from local tissue progenitor cells. In particular, we find that AKI promotes the development of pRCC from single progenitors through a classical adenoma-carcinoma sequence.
The Oxford Nanopore Technologies MinION is a new device, based on nanopore sequencing, that is able to generate reads tens of kilobases in length with a shorter sequencing time than other platforms. To evaluate whether nanopore data can be exploited for resequencing analyses, we used the largest MinION data set to date and compared it with Illumina and Pacific Biosciences technologies. By using five different mapping approaches, we estimated that the global sequencing error rate of MinION reads, mainly caused by inserted and deleted bases, is around 11%. The study of error distribution showed that substituted, inserted and deleted bases are not randomly distributed along the reads, but mainly occur in specific nucleotide patterns, generating a significant number of genomic loci that can be misclassified as false-positive variants. With 40× sequencing coverage, MinION data can produce at best around one false substitution and insertion every 10-50 kb, and one false deletion every 1000 bp, making the use of this technology still challenging for small-sized variant discovery. We also analyzed the depth of coverage distribution and demonstrated that nanopore sequencing is a uniform process that generates sequences randomly and independently, without classical sources of bias such as GC-content and mappability. Owing to these properties, MinION data can be readily used to detect genomic regions involved in copy number variants with high accuracy, outperforming other state-of-the-art sequencing methods in terms of both sensitivity and specificity.
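To put the false-call rates quoted in this abstract in perspective, the following back-of-the-envelope sketch converts "one false call every N bp" into an expected count over a region. The function name and the 1 Mb region size are illustrative assumptions; only the spacing values (10-50 kb and 1000 bp) come from the abstract.

```python
def expected_false_calls(region_bp, bp_per_false_call):
    """Expected number of false calls in a region, given one false call per N bp."""
    if bp_per_false_call <= 0:
        raise ValueError("bp_per_false_call must be positive")
    return region_bp / bp_per_false_call


if __name__ == "__main__":
    region = 1_000_000  # assumed 1 Mb region, chosen only for illustration

    # One false substitution or insertion every 10-50 kb (range from the abstract).
    low = expected_false_calls(region, 50_000)
    high = expected_false_calls(region, 10_000)
    print(f"false substitutions/insertions per Mb: {low:.0f} to {high:.0f}")

    # One false deletion every 1000 bp (value from the abstract).
    deletions = expected_false_calls(region, 1_000)
    print(f"false deletions per Mb: {deletions:.0f}")
```

Under these figures, false deletions are one to two orders of magnitude more frequent than false substitutions or insertions, which is consistent with the abstract's caution about small-variant discovery.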
Copy Number Variants (CNVs) are structural rearrangements contributing to phenotypic variation that have been shown to be associated with many disease states. In recent years, the identification of CNVs from whole-exome sequencing (WES) data has become common practice for research and clinical purposes and, consequently, the demand for more efficient and accurate methods has increased. In this paper, we demonstrate that more than 30% of WES data map outside the targeted regions and that these reads, usually discarded, can be exploited to enhance the identification of CNVs from WES experiments. Here, we present EXCAVATOR2, the first read count based tool that exploits all the reads produced by WES experiments to detect CNVs with a genome-wide resolution. To evaluate the performance of our novel tool, we use it to analyse two WES data sets: a population data set sequenced by the 1000 Genomes Project and a tumor data set made of bladder cancer samples. The results obtained from these analyses demonstrate that EXCAVATOR2 outperforms four other state-of-the-art methods and that our combined approach enlarges the spectrum of detectable CNVs from WES data with an unprecedented resolution. EXCAVATOR2 is freely available at http://sourceforge.net/projects/excavator2tool/.
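The off-target fraction mentioned in this abstract can be estimated, in principle, by checking each mapped read against the capture intervals. The sketch below does this for a single chromosome using read start positions; it is not part of EXCAVATOR2, and the function name, the half-open interval convention, and the toy coordinates are assumptions for illustration.

```python
from bisect import bisect_right


def off_target_fraction(read_positions, targets):
    """Fraction of reads whose start position falls outside every target interval.

    `targets` must be sorted, non-overlapping, half-open (start, end) intervals
    on a single chromosome (an assumed, simplified representation).
    """
    if not read_positions:
        raise ValueError("read_positions must not be empty")
    starts = [start for start, _ in targets]
    off_target = 0
    for pos in read_positions:
        # Index of the last interval starting at or before this position.
        i = bisect_right(starts, pos) - 1
        on_target = i >= 0 and pos < targets[i][1]
        if not on_target:
            off_target += 1
    return off_target / len(read_positions)


if __name__ == "__main__":
    toy_targets = [(100, 200), (500, 600)]    # hypothetical exon coordinates
    toy_reads = [50, 150, 199, 200, 550, 700]  # hypothetical read start positions
    print(off_target_fraction(toy_reads, toy_targets))
```

In practice the intervals would come from the capture kit's target file and the positions from an alignment file, and a read would usually be tested by overlap rather than by start position alone.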