Introduction: Trypanosoma cruzi, Trypanosoma brucei, and Leishmania spp., commonly referred to as TriTryps, are a group of protozoan parasites that cause important human diseases affecting millions of people in the most vulnerable populations worldwide. Current treatments have limited efficacy and can cause serious side effects, so there is an urgent need to develop new control strategies. The identification and prioritization of appropriate targets can be aided by integrative genomic and computational approaches. Methods: In this work, we applied a genome-wide multidimensional data integration strategy to prioritize drug targets. We included genomic, transcriptomic, metabolic, and protein structural data sources to delineate candidate proteins with relevant features for target selection in drug development. Results and Discussion: Our final ranked list includes proteins shared by TriTryps and covers a range of biological functions, including proteins essential for parasite survival or growth, oxidative stress-related enzymes, virulence factors, and proteins that are exclusive to these parasites. Our strategy recovered previously described candidates, which validates our approach, and identified new proteins that may be attractive targets to consider during the initial steps of drug discovery.
Motivation: The use of high precision for representing quality scores in nanopore sequencing data makes these scores hard to compress and, thus, responsible for most of the information stored in losslessly compressed FASTQ files. This motivates the investigation of the effect of quality score information loss on downstream analysis from nanopore sequencing FASTQ files. Results: We polished de novo assemblies for a mock microbial community and a human genome, and we called variants on a human genome. We repeated these experiments using various pipelines, at various coverage levels, and with various quality score quantizers. In all cases we found that the quantization of quality scores causes little difference in (and sometimes even improves) the results obtained with the original (non-quantized) data. This suggests that the precision that is currently used for nanopore quality scores may be unnecessarily high, and motivates the use of lossy compression algorithms for this kind of data. Moreover, we show that even a non-specialized compressor, like gzip, yields large storage space savings after quantization of quality scores. Availability: Quantizers are freely available for download at https://github.com/mrivarauy/QS-Quantizer Supplementary information: Available at https://github.com/mrivarauy/QS-Quantizer
We investigate the effect of quality score information loss on downstream analysis from nanopore sequencing FASTQ files. We polished de novo assemblies for a mock microbial community and a human genome, and we called variants on a human genome. We repeated these experiments using various pipelines, at various coverage levels, and with various quality score quantizers. In all cases we found that the quantization of quality scores causes little difference in (or even improves) the results obtained with the original (non-quantized) data. This suggests that the precision that is currently used for nanopore quality scores is unnecessarily high, and motivates the use of lossy compression algorithms for this kind of data. Moreover, we show that even a non-specialized compressor, like gzip, yields large storage space savings after quantization of quality scores.
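To make the idea of quantizing quality scores before general-purpose compression concrete, the following is a minimal sketch and not the authors' quantizers (those are distributed in the QS-Quantizer repository). It assumes a simple uniform binning scheme, the standard Phred+33 ASCII encoding of FASTQ quality strings, and hypothetical helper names `quantize_quality` and `gzip_size`; the bin width of 8 is an arbitrary illustrative choice.

```python
import gzip

PHRED_OFFSET = 33   # Phred+33 ASCII encoding (assumed for the input FASTQ)
MAX_PHRED = 93      # highest score representable in printable ASCII


def quantize_quality(qual: str, bin_size: int = 8) -> str:
    """Replace each Phred score with the midpoint of its uniform bin.

    Hypothetical uniform quantizer for illustration only; it is not one
    of the quantizers evaluated in the paper.
    """
    out = []
    for ch in qual:
        q = ord(ch) - PHRED_OFFSET
        bin_start = (q // bin_size) * bin_size
        # Clamp so the representative stays a valid printable score.
        rep = min(bin_start + bin_size // 2, MAX_PHRED)
        out.append(chr(rep + PHRED_OFFSET))
    return "".join(out)


def gzip_size(text: str) -> int:
    """Return the size in bytes of the gzip-compressed ASCII text."""
    return len(gzip.compress(text.encode("ascii")))


if __name__ == "__main__":
    import random

    # Synthetic quality string standing in for real nanopore scores.
    rng = random.Random(0)
    qual = "".join(chr(PHRED_OFFSET + rng.randint(0, 40)) for _ in range(10_000))
    quantized = quantize_quality(qual)
    print("original bytes after gzip: ", gzip_size(qual))
    print("quantized bytes after gzip:", gzip_size(quantized))
```

Quantization shrinks the alphabet of the quality string, which is what lets a non-specialized compressor such as gzip encode it in fewer bytes. Whether a given quantizer preserves downstream accuracy is an empirical question of the kind the abstract addresses, and this sketch does not test it.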