2020
DOI: 10.1093/bioinformatics/btaa051

BioSeqZip: a collapser of NGS redundant reads for the optimization of sequence analysis

Abstract: Motivation: High-throughput next-generation sequencing can generate huge sequence files, whose analysis requires alignment algorithms that are typically very demanding in terms of memory and computational resources. This is a significant issue, especially for machines with limited hardware capabilities. As the redundancy of the sequences typically increases with coverage, collapsing such files into compact sets of non-redundant reads has the 2-fold advantage of reducing file size and speeding-…

Citations: cited by 11 publications (25 citation statements)
References: 23 publications
“…Compared with first-generation sequencing technology, high-throughput sequencing technology is a milestone in the field of genomics because of lower costs and better opportunity to identify DEGs for further studying the pathogenesis of diseases. 39 41 Brucella can be transmitted through contaminated placenta and aerosol, and can enter the host through gastrointestinal and respiratory mucosae. The mucosal immune response is the host’s main defense against invasive Brucella.…”
Section: Discussion (mentioning)
confidence: 99%
“…Tools and data source: The tools selected for this analysis were MarDRe (Expósito et al, 2017), ParDRe, FastUniq (Xu et al, 2012), NGS Reads Treatment (Gaia et al, 2019), and BioSeqZip (Urgese et al, 2020). They were chosen because they are tools capable of manipulating platform-independent NGS data and are freely available to the scientific community.…”
Section: Methods (mentioning)
confidence: 99%
“…BioSeqZip [10] consumes almost the same amount of memory in all datasets because of its memory limit control. Therefore, it has a smaller memory footprint than Fast-HBR in all datasets except SRR10315305.…”
Section: NGS Reads Treatment (mentioning)
confidence: 99%
“…Some examples of de novo tools are CD-HIT [6], FastUniq [7] and Fulcrum [8]. Available de novo tools include NGS Reads Treatment [9], Nubeam-dedup [5], BioSeqZip [10] and Minirmd [11]. NGS Reads Treatment [9] is a hash-based tool that uses Cuckoo Filter [12] which is a probabilistic data structure.…”
Section: Introduction (mentioning)
confidence: 99%