Kin-On Cheng scite author profile

Background: DNA microarray technology allows the expression levels of thousands of genes to be measured under tens or hundreds of different conditions. In microarray data, genes with similar functions usually co-express under certain conditions only [1]. Thus, biclustering, which clusters genes and conditions simultaneously, is preferred over traditional clustering techniques for discovering these coherent genes. Various biclustering algorithms have been developed using different bicluster formulations. Unfortunately, many useful formulations result in NP-complete problems. In this article, we investigate an efficient method for identifying a popular type of bicluster called the additive model. Furthermore, parallel coordinate (PC) plots are used for bicluster visualization and analysis.
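The abstract above does not define the additive model, so the following is only a minimal sketch under a common assumption: a bicluster is additive when each entry can be written as a constant plus a gene (row) effect plus a condition (column) effect. The function name `additive_residue` and the sample values are hypothetical, and the mean squared residue used here is a generic coherence score, not the detection method described in the paper.

```python
import numpy as np

def additive_residue(sub):
    """Mean squared deviation of a submatrix from a fitted additive model.

    Assumed model (not taken from the abstract): a[i, j] = mu + alpha[i] + beta[j].
    A perfectly additive submatrix yields a residue of (numerically) zero.
    """
    sub = np.asarray(sub, dtype=float)
    row_mean = sub.mean(axis=1, keepdims=True)   # per-gene mean
    col_mean = sub.mean(axis=0, keepdims=True)   # per-condition mean
    residual = sub - row_mean - col_mean + sub.mean()
    return float(np.mean(residual ** 2))

# Hypothetical example: 3 genes x 4 conditions built from additive effects.
gene_effects = np.array([0.0, 1.5, -2.0])
condition_effects = np.array([1.0, 3.0, 2.0, 5.0])
bicluster = 10.0 + gene_effects[:, None] + condition_effects[None, :]

# Perturbing a single entry breaks the additive pattern.
perturbed = bicluster.copy()
perturbed[0, 0] += 1.0
```

Under these assumptions, `additive_residue(bicluster)` is close to zero while `additive_residue(perturbed)` is strictly positive, which is one simple way to score how coherent a candidate set of genes and conditions is.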


BiVisu: software tool for bicluster detection and visualization

Cheng, Law, Siu, et al. (2007)

Iterative bicluster-based least square framework for estimation of missing values in microarray gene expression data

Cheng, Law, Siu (2012). Pattern Recognition.

DNA microarray experiments inevitably generate gene expression data with missing values. An important and necessary pre-processing step is thus to impute these missing values. Existing imputation methods exploit gene correlation among all experimental conditions for estimating the missing values. However, related genes coexpress in subsets of experimental conditions only. In this paper, we propose to use biclusters, which contain similar genes under a subset of conditions, for characterizing the gene similarity and then estimating the missing values. To further improve the accuracy of missing value estimation, an iterative framework is developed with a stopping criterion based on minimizing uncertainty. Extensive experiments have been conducted on artificial datasets, real microarray datasets, and one non-microarray dataset. Our proposed bicluster-based approach is able to reduce errors in missing value estimation.


Multiscale directional filter bank with applications to structured and random texture retrieval

Cheng, Law, Siu (2007). Pattern Recognition.

Clustering-Based Compression for Population DNA Sequences

Cheng, Law, Siu (2019). IEEE/ACM Trans. Comput. Biol. and Bioinf.

A Novel Fast and Reduced Redundancy Structure for Multiscale Directional Filter Banks

Cheng, Law, Siu (2007). IEEE Trans. on Image Process.

The multiscale directional filter bank (MDFB) improves the radial frequency resolution of the contourlet transform by introducing an additional decomposition in the high-frequency band. The increase in frequency resolution is particularly useful for texture description because of the quasi-periodic property of textures. However, the MDFB needs an extra set of scale and directional decompositions, which is performed on the full image size. The rise in computational complexity is, thus, prominent. In this paper, we develop an efficient implementation framework for the MDFB. In the new framework, directional decomposition on the first two scales is performed prior to the scale decomposition. This allows the directional decomposition to be shared between the two scales and, hence, reduces the computational complexity significantly. Based on this framework, two fast implementations of the MDFB are proposed. The first one maintains the same flexibility in directional selectivity in the first two scales, while the other has the same redundancy ratio as the contourlet transform. Experimental results show that the first and the second schemes reduce the computational time by 33.3%-34.6% and 37.1%-37.5%, respectively, compared to the original MDFB algorithm. Meanwhile, the texture retrieval performance of the proposed algorithms is comparable to that of the original MDFB approach, which outperforms the steerable pyramid and the contourlet transform approaches.


Compression of Multiple DNA Sequences Using Intra-Sequence and Inter-Sequence Similarities

Cheng, Law, et al. (2015). IEEE/ACM Trans. Comput. Biol. and Bioinf.

Traditionally, intra-sequence similarity is exploited for compressing a single DNA sequence. Recently, remarkable compression performance has been achieved for an individual DNA sequence from the same population by encoding its difference from a nearly identical reference sequence. Nevertheless, there is a lack of general algorithms that also allow less similar reference sequences. In this work, we extend the intra-sequence similarity to the inter-sequence similarity, in that approximate matches of subsequences are found between the DNA sequence and a set of reference sequences. Hence, a set of nearly identical DNA sequences from the same population, or a set of partially similar DNA sequences such as chromosome sequences and DNA sequences of related species, can be compressed together. For practical compressors, the compressed size is usually influenced by the compression order of the sequences. Fast search algorithms for the optimal compression order are thus developed for multiple-sequence compression. Experimental results on artificial and real datasets demonstrate that our proposed multiple-sequence compression methods with fast compression order search are able to achieve good compression performance under different levels of similarity in the multiple DNA sequences.
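The general idea of encoding a sequence as its difference from a reference can be illustrated with a short sketch. This is not the compressor described in the abstract: it uses Python's standard `difflib.SequenceMatcher` as a stand-in matcher, and the function names `encode` and `decode`, the operation format, and the sample sequences are all hypothetical.

```python
from difflib import SequenceMatcher

def encode(reference, target):
    """Describe `target` as copies from `reference` plus literal fragments."""
    ops = []
    matcher = SequenceMatcher(None, reference, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Region shared with the reference: store position and length only.
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # Region absent from the reference: store the bases themselves.
            ops.append(("literal", target[j1:j2]))
        # "delete" regions exist only in the reference, so nothing is emitted.
    return ops

def decode(reference, ops):
    """Rebuild the target sequence from the reference and the operations."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            _, start, length = op
            parts.append(reference[start:start + length])
        else:
            parts.append(op[1])
    return "".join(parts)

# Hypothetical sequences that differ by one substitution and one insertion.
reference = "ACGTACGTTAGC"
target = "ACGTACCTTAGCA"
ops = encode(reference, target)
restored = decode(reference, ops)
```

When the two sequences are nearly identical, most operations are short `copy` records, which is why a close reference helps; with a less similar reference, more of the target has to be stored as literals.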


Use of biclustering for missing value imputation in gene expression data

Cheng, Law, Siu (2013). AIR.

DNA microarray data always contain missing values. As subsequent analysis such as biclustering can only be applied to complete data, these missing values have to be imputed before any biclusters can be detected. Existing imputation methods exploit coherence among expression values in the microarray data. Given that biclustering attempts to find correlated expression values within the data, we propose to combine missing value imputation and biclustering into a single framework in which the two processes are performed iteratively. In this way, the missing value imputation can improve bicluster analysis, and the coherence in detected biclusters can be exploited for better missing value estimation. Experiments have been conducted on artificial datasets and real datasets to verify the effectiveness of the proposed algorithm in reducing estimation errors of missing values.
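The alternation between fitting a coherent pattern and re-estimating missing values can be sketched in a few lines. This is a deliberately simplified stand-in, not the framework from the abstract: it skips bicluster detection entirely and treats the whole matrix as one block following an assumed additive (row effect plus column effect) pattern. The function name `impute_additive`, its parameters, and the sample data are hypothetical.

```python
import numpy as np

def impute_additive(data, n_iter=100, tol=1e-9):
    """Iteratively fill NaN entries using a fitted additive pattern.

    Simplifying assumptions: every row has at least one observed value,
    and the whole matrix is treated as a single coherent block.
    """
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    filled = data.copy()
    # Initial guess: mean of the observed values in the same row.
    row_means = np.nanmean(data, axis=1, keepdims=True)
    filled[missing] = np.broadcast_to(row_means, data.shape)[missing]
    for _ in range(n_iter):
        # Fit step: additive pattern from the currently completed matrix.
        row = filled.mean(axis=1, keepdims=True)
        col = filled.mean(axis=0, keepdims=True)
        fit = row + col - filled.mean()
        # Estimate step: overwrite only the originally missing entries.
        change = np.max(np.abs(fit[missing] - filled[missing])) if missing.any() else 0.0
        filled[missing] = fit[missing]
        if change < tol:
            break
    return filled

# Hypothetical 4 x 4 additive matrix with one value removed.
truth = np.array([0.0, 1.0, 2.0, 4.0])[:, None] + np.array([1.0, 3.0, 5.0, 6.0])[None, :]
observed = truth.copy()
observed[1, 2] = np.nan
estimate = impute_additive(observed)
```

On this noise-free example the estimate converges to the removed value. Real expression data are noisy and only locally coherent, which is the motivation for restricting the fit to detected biclusters rather than the full matrix.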

Identification of coherent patterns in gene expression data using an efficient biclustering algorithm and parallel coordinate visualization
