Summary: Third-generation sequencing technologies from companies such as Oxford Nanopore and Pacific Biosciences have paved the way for building more contiguous and potentially gap-free assemblies. The larger effective length of their reads has provided a means to overcome the challenges of short to mid-range repeats. Currently, accurate long-read assemblers are computationally expensive, whereas faster methods are not as accurate. Moreover, despite recent advances in third-generation sequencing, researchers still tend to generate accurate short reads for many analysis tasks. Here, we present HASLR, a hybrid assembler that uses error-prone long reads together with high-quality short reads to efficiently generate accurate genome assemblies. Our experiments show that HASLR is not only the fastest assembler but also the one with the lowest number of misassemblies on most of the samples, while being on par with other assemblers in terms of contiguity and accuracy.
Motivation: Despite recent advances in algorithm design for characterizing structural variation using high-throughput short-read sequencing (HTS) data, characterization of novel sequence insertions longer than the average read length remains a challenging task. This is mainly due to both computational difficulties and the complexities imposed by genomic repeats in generating reliable assemblies to accurately detect both the sequence content and the exact location of such insertions. Additionally, de novo genome assembly algorithms typically require a very high depth of coverage, which may be a limiting factor for most genome studies. Therefore, characterization of novel sequence insertions is not a routine part of most sequencing projects.
There are only a handful of algorithms specifically developed for novel sequence insertion discovery that can bypass the need for whole-genome de novo assembly. Still, most such algorithms rely on high depth of coverage, and to our knowledge there is only one method (PopIns) that can use multi-sample data to "collectively" obtain a very high coverage dataset to accurately find insertions common in a given population.
Results: Here, we present Pamir, a new algorithm to efficiently and accurately discover and genotype novel sequence insertions using either single or multiple genome sequencing datasets. Pamir is able to detect breakpoint locations of the insertions and calculate their zygosity (i.e. heterozygous versus homozygous) by analyzing multiple sequence signatures, matching one-end-anchored sequences to small-scale de novo assemblies of unmapped reads, and conducting strand-aware local assembly. We test the efficacy of Pamir on both simulated and real data, and demonstrate its potential use in accurate and routine identification of novel sequence insertions in genome projects.
Availability and implementation: Pamir is available at https://github.com/vpc-ccg/pamir.
Supplementary information: Supplementary data are available at Bioinformatics online.
Clear-cell renal cell carcinoma (ccRCC) is a common therapy-resistant disease with aberrant angiogenic and immunosuppressive features. Patients with metastatic disease are treated with targeted therapies based on clinical features: low-risk patients are usually treated with anti-angiogenic drugs and intermediate/high-risk patients with immune therapy. However, there are no biomarkers available to guide treatment choice for these patients. A recently published phase II clinical trial observed a correlation between ccRCC patients' clustering and their response to targeted therapy. However, the clustering of these groups was not distinct. Here, we analyzed the gene expression profiles of 469 ccRCC patients using a feature selection technique and developed a refined 66-gene signature for improved sub-classification of patients. Moreover, we have identified a novel comprehensive expression profile to distinguish between migratory stromal and immune cells. Furthermore, the proposed 66-gene signature was validated using a different cohort of 64 ccRCC patients. These findings are foundational for the development of reliable biomarkers that may guide treatment decision-making and improve therapy response in ccRCC patients.
Clear-cell renal cell carcinoma (ccRCC) tumors have been reported to be highly angiogenic and to have immunosuppressive features 1,2. Recent publications show increased expression of the immune inhibitory ligand and receptors (PD-L1/CTLA4) on tumor cells and/or tumor-infiltrating immune cells 3,4. Currently, tumor mutation burden is considered a predictive biomarker for response to immune checkpoint inhibitors (ICIs). However, research studies have shown that ccRCC has a low mutational burden but the highest immune infiltration score compared to other cancer types 5,6. In a different study, a pan-cancer analysis found renal cell carcinomas (RCC) to have the highest proportion of indel mutations, which can increase tumor neoantigen abundance 7. 
These anomalies in ccRCC make it the perfect platform to study dynamic biomarkers. ccRCC patients with clinically localised tumors undergo partial or radical nephrectomy, but ~30% of the patients present with de novo metastatic disease 8. Metastatic patients are usually treated with systemic therapies based on clinical features 9. The prognostic value of different risk stratification tools is limited to clinical and pathological features of the patients 10. In Canada, the International Metastatic Renal Cell Carcinoma Database Consortium (IMDC) risk model is applied with six clinical and laboratory factors: Karnofsky performance status, time of first-line targeted therapy from diagnosis, haemoglobin concentration, serum calcium concentration, neutrophil and platelet counts 9. According to the IMDC risk stratification, low-risk metastatic RCC patients are usually treated with anti-angiogenic tyrosine kinase inhibitors (TKIs) and intermediate/high-risk patients with ICIs 11. Risk stratification models based on gene expression patterns (both messenger and long non-coding RN...
Outcomes of DSAEK performed by cornea fellows under faculty supervision seem to be fairly acceptable.