2015
DOI: 10.6026/97320630011267

Compression of Large genomic datasets using COMRAD on Parallel Computing Platform

Abstract: Big data storage is a challenge in the post-genome era. Hence, there is a need for high-performance computing solutions for managing large genomic data. Therefore, it is of interest to describe a parallel-computing approach that uses a message-passing library to distribute the different compression stages across clusters. Genomic compression helps to reduce the on-disk “footprint” of large volumes of sequence data. This supports the computational infrastructure for more efficient archiving. The approach was …

Citations: cited by 2 publications (5 citation statements)
References: references 11 publications (14 reference statements)
“…However, not all the methods in Section 5 can provide effective compression for sequences which only share partial similarity in certain portions. Methods that can be applied for characterising partial similarity include COMRAD [52,53], ERGC [54], CoGI [55], MSC [56] and referential compression algorithm [81].…”
Section: Reference-based DNA Compression Methods For Partially Simila… (citation type: mentioning)
confidence: 99%
“…These rules are then entropy encoded. Example rule-based compression methods are DNA Sequitur [31] and COMRAD [52,53]. In parsing methods, the DNA sequence is usually divided into a number of phrases using a greedy strategy where the phrases denote repeats in the sequence.…”
Section: Signal Processing Techniques For Identifying Sequence Simila… (citation type: mentioning)
confidence: 99%
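
The rule-based idea in the statement above (repeats become grammar rules, which are then entropy encoded) can be illustrated with a minimal sketch. This is not COMRAD or DNA Sequitur: it is a simplified digram-replacement scheme in the style of Re-Pair, the names `build_rules` and `expand` are hypothetical, and the entropy-coding stage is left out.

```python
from collections import Counter


def build_rules(seq, min_count=2):
    """Replace the most frequent adjacent pair with a new rule symbol, repeatedly.

    Illustrative only: returns the shortened symbol list and a rule table
    mapping each rule name to the pair of symbols it stands for.
    """
    symbols = list(seq)          # start from single-character symbols
    rules = {}
    next_id = 0
    while True:
        pairs = Counter(zip(symbols, symbols[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < min_count:    # no pair repeats often enough to be worth a rule
            break
        name = f"R{next_id}"     # rule names are two or more characters long,
        next_id += 1             # so they cannot clash with input characters
        rules[name] = pair
        out = []
        i = 0
        while i < len(symbols):  # left-to-right, non-overlapping replacement
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return symbols, rules


def expand(symbols, rules):
    """Invert build_rules by expanding rule symbols until only characters remain."""
    out = []
    stack = list(reversed(symbols))
    while stack:
        s = stack.pop()
        if s in rules:
            left, right = rules[s]
            stack.append(right)  # push right first so left is expanded first
            stack.append(left)
        else:
            out.append(s)
    return "".join(out)
```

For a repetitive input such as `"ACGTACGTACGT"`, `build_rules` returns a symbol list shorter than the input plus a small rule table, and `expand` recovers the original string. A real compressor would then entropy encode the rules and the remaining symbols.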