2017
DOI: 10.1093/bioinformatics/btx737

CALQ: compression of quality values of aligned sequencing data

Abstract: Supplementary data are available at Bioinformatics online.

Citations: cited by 20 publications (25 citation statements)
References: 30 publications
“…When it comes to aligned data, an MPEG-G encoder could use a compression method comparable to that of DeeZ [8], which is able to compress a 437 GB H. Sapiens SAM file to about 63 GB, as compared to 75 GB by CRAM (Scramble) or 106 GB by BAM [8]. Regarding quantization of quality values, methods like QVZ [9] and CALQ [10] could be applied yielding overall compression gains of 10x over BAM, while preserving, or even improving, variant calling performance [11].…”
Section: Compression Capabilities (mentioning)
confidence: 99%
“…In the case of unaligned reads, an MPEG-G compliant encoder is free to choose any beneficial quantization scheme. This includes quantization schemes of recently published research such as [9,10,11,12,13]. The specific quantization scheme used is signaled to a decoder by the means of a Quality Value Codebook.…”
Section: Compression Modes For Quality Values (mentioning)
confidence: 99%
“…This idea has been further improved by the more recent QVZ and QVZ 2 compressors [14,10]. Besides the binning and statistical inference approaches, there are other efforts which exploit the information contained in the readout nucleotide sequences [11,23,21]. For example, the Quartz compressor [23] sets the quality scores of the most frequent k-mers to a predefined high value with the motivation that if a specific nucleotide sequence is observed many times, then its correctness does not need any further verification from the quality scores.…”
Section: Previous Studies (mentioning)
confidence: 99%
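
Illustration: the k-mer idea attributed to Quartz in the statement above can be sketched roughly as follows. This is a simplified toy, not the Quartz implementation: it counts k-mers within the input reads themselves, and the function name `smooth_qualities`, the parameters `k`, `min_count` and `high_q`, and the Phred+33 character `"I"` are all assumptions made for the example.

```python
from collections import Counter


def smooth_qualities(reads, k=4, min_count=2, high_q="I"):
    """Toy k-mer based quality smoothing (hypothetical helper, not Quartz itself).

    reads: list of (sequence, quality_string) pairs of equal length.
    Every base covered by a k-mer seen at least `min_count` times across
    all reads gets its quality character replaced by `high_q`.
    """
    # Count every k-mer occurrence over all reads.
    counts = Counter(
        seq[i:i + k] for seq, _ in reads for i in range(len(seq) - k + 1)
    )
    smoothed = []
    for seq, qual in reads:
        new_qual = list(qual)
        for i in range(len(seq) - k + 1):
            if counts[seq[i:i + k]] >= min_count:
                # Overwrite the k positions covered by this frequent k-mer.
                new_qual[i:i + k] = high_q * k
        smoothed.append((seq, "".join(new_qual)))
    return smoothed


reads = [("ACGTAC", "!!!!!!"), ("ACGTTT", "######")]
print(smooth_qualities(reads))
```

Only the shared k-mer `ACGT` reaches the threshold here, so the first four qualities of each read are raised while the rest are left untouched. Replacing many distinct quality characters with one value is what makes the quality stream easier to compress.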
“…"Vertical" compression takes a slice through an aligned dataset in the SAM format (Li et al, 2009) to determine which qualities to keep and which to discard, as used in CALQ (Voges et al, 2017), or via hashing techniques on unaligned data in Leon (Benoit et al, 2015) and GeneCodeq (Greenfield et al, 2016). Traditional loss measures, such as mean squared error, will appear very high, but these tools focus on minimising the changes in post-processed data (variant calling).…”
Section: Introduction (mentioning)
confidence: 99%