Paula Wu scite author profile

Paula Wu

3 Publications

18 Citation Statements Received

88 Citation Statements Given

Affiliations

Hong Kong Polytechnic University

Publications

Compression of Multiple DNA Sequences Using Intra-Sequence and Inter-Sequence Similarities

Cheng

Law

et al. 2015

IEEE/ACM Trans. Comput. Biol. and Bioinf.

Traditionally, intra-sequence similarity is exploited for compressing a single DNA sequence. Recently, remarkable compression performance for an individual DNA sequence from the same population has been achieved by encoding its differences from a nearly identical reference sequence. Nevertheless, there is a lack of general algorithms that also allow less similar reference sequences. In this work, we extend intra-sequence similarity to inter-sequence similarity, in that approximate matches of subsequences are found between the DNA sequence and a set of reference sequences. Hence, a set of nearly identical DNA sequences from the same population, or a set of partially similar DNA sequences such as chromosome sequences and DNA sequences of related species, can be compressed together. For practical compressors, the compressed size is usually influenced by the order in which the sequences are compressed. Fast search algorithms for the optimal compression order are thus developed for multiple-sequence compression. Experimental results on artificial and real datasets demonstrate that our proposed multiple-sequence compression methods with fast compression order search achieve good compression performance under different levels of similarity among the DNA sequences.
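The dependence of compressed size on compression order can be illustrated with a toy sketch. This is not the compressor or the fast order search from the paper: it assumes zlib with a preset dictionary as a stand-in for reference-based coding (each sequence uses the previously compressed one as its reference), and it finds the best order by brute force over all permutations, which is only feasible for a handful of sequences. The function names `compressed_size`, `total_size`, and `best_order` are hypothetical.

```python
import itertools
import zlib


def compressed_size(seq: bytes, reference: bytes = b"") -> int:
    """Size in bytes of seq after DEFLATE, optionally primed with a reference.

    Assumption: a zlib preset dictionary stands in for a reference sequence.
    zlib only uses the last 32 KiB of the dictionary, so this is a toy model.
    """
    if reference:
        comp = zlib.compressobj(level=9, zdict=reference)
    else:
        comp = zlib.compressobj(level=9)
    return len(comp.compress(seq) + comp.flush())


def total_size(order, seqs) -> int:
    """Total compressed size when sequences are compressed in the given order,
    each one using the previously compressed sequence as its reference."""
    total = 0
    reference = b""
    for idx in order:
        total += compressed_size(seqs[idx], reference)
        reference = seqs[idx]
    return total


def best_order(seqs):
    """Exhaustive search over all orders (factorial cost, small inputs only)."""
    candidates = itertools.permutations(range(len(seqs)))
    return min(candidates, key=lambda order: total_size(order, seqs))
```

Calling `best_order(seqs)` on a short list of byte strings returns a tuple of indices; comparing `total_size` for that tuple against other orders shows how placing similar sequences next to each other reduces the total.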

Analysis of cross sequence similarities for multiple DNA sequences compression

Law

Siu

2009

IJCAET

Current DNA compression algorithms rely on finding repetitions within the DNA sequence so that similar subsequences can be encoded by referencing each other. We explore similarities between different chromosomes of the sequence 'Saccharomyces cerevisiae'. These similarities are characterised by the existence of similar subsequences among different chromosomes. The longer the similar subsequences are, the higher the cross-similarities are. Our study indicates that these cross-sequence similarities are often significant compared with self-sequence similarity. This implies that it would be advantageous to compress two or more chromosome sequences together so that similar subsequences found between multiple chromosome sequences can be encoded together.
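A minimal sketch of how cross-sequence similarity might be quantified, under simplifying assumptions: it counts only exact matches (the abstract speaks of similar, not necessarily identical, subsequences), and the two measures shown, the longest shared run and the fraction of shared k-mers, are illustrative choices rather than the measures used in the study. The names `longest_shared_run` and `cross_similarity` are hypothetical.

```python
from difflib import SequenceMatcher


def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest exact substring that occurs in both a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return match.size


def kmers(seq: str, k: int) -> set:
    """Set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cross_similarity(a: str, b: str, k: int) -> float:
    """Fraction of the distinct k-mers of a that also occur in b.

    Passing the same sequence twice always yields 1.0, so a self-similarity
    baseline would need a different measure (for example, repeated k-mers
    within one sequence); that is left out of this sketch.
    """
    ka = kmers(a, k)
    if not ka:
        return 0.0
    return len(ka & kmers(b, k)) / len(ka)
```

Larger values of `k` demand longer shared subsequences, which mirrors the abstract's remark that longer similar subsequences indicate higher cross-similarity.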

Study Of Inter-sequence Similarity For Multiple DNA Sequence Compression

Wu

Law

Siu

2007

Current DNA compression algorithms rely on finding repetitions within the DNA sequence so that similar subsequences can be encoded by referencing each other. In this paper, we explore similarities between different chromosomes of the sequence "Saccharomyces cerevisiae". These similarities are characterized by the existence of similar subsequences among different chromosomes. The longer the similar subsequences are, the higher the cross-similarities are. Our study indicates that these cross-sequence similarities are often significant compared with self-sequence similarities. This implies that it would be advantageous to compress two or more sequences together so that similar subsequences found between multiple sequences can be encoded together.
