Partial least squares, as a dimension reduction technique, has become increasingly important for its ability to handle problems with a large number of variables. Since noisy variables may weaken estimation performance, the sparse partial least squares (SPLS) technique has been proposed to identify important variables and generate more interpretable results. However, the small sample size of a single dataset limits the performance of conventional methods. An effective solution is to gather information from multiple comparable studies. Integrative analysis plays an essential role in the analysis of multiple datasets. The main idea is to improve performance by assembling raw data from multiple independent datasets and analyzing them jointly. In this article, we develop an integrative SPLS (iSPLS) method using penalization based on the SPLS technique. The proposed approach employs two penalties. The first penalty conducts variable selection in the context of integrative analysis. The second, a contrasted penalty, is imposed to encourage similarity of estimates across datasets and generate more sensible and accurate results. Computational algorithms are developed. Simulation experiments are conducted to compare iSPLS with alternative approaches. The practical utility of iSPLS is shown in the analysis of two TCGA gene expression datasets.
KEYWORDS: contrasted penalization, integrative analysis, partial least squares
INTRODUCTION

Data with high-dimensional variables are becoming routine. With such data, partial least squares (PLS), initially developed by Wold et al, 1 has been successfully used as a dimension reduction method in many areas such as chemometrics 2 and genetics. 3 PLS reduces variable dimension by constructing new components, which are linear combinations of the original variables. It possesses much-desired properties such as stability under collinearity and high dimensionality, giving it a clear advantage over many other methods. In high-dimensional analysis, noise accumulation from irrelevant variables has long been recognized. 4 For example, in omics studies, it is widely accepted that only a small fraction of genes are associated with outcomes. To yield more accurate estimation and facilitate interpretation, variable selection needs to be considered. Chun and Keleş 5 proposed the sparse PLS (SPLS) technique, which conducts variable selection and dimension reduction simultaneously by imposing an elastic net penalty within the PLS optimization.
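To make the idea concrete, the sketch below estimates the first PLS direction (the covariance between each variable and the outcome) and then soft-thresholds it to zero out noise variables. This is a minimal illustration of the sparsity idea, not the exact elastic net formulation of Chun and Keleş; the function name, the thresholding rule, and the toy data are assumptions for illustration only.

```python
import numpy as np

def sparse_pls_direction(X, y, lam=0.2):
    """Sparse estimate of the first PLS direction (illustrative sketch).

    The dense direction maximizing Cov(Xw, y) is proportional to X'y;
    soft-thresholding that covariance vector (a simplified stand-in for
    the elastic net step of SPLS) drops weakly associated variables.
    """
    # covariance-scale association between each column of X and y
    c = X.T @ y
    # soft-threshold relative to the largest entry: shrink and zero out
    thresh = lam * np.max(np.abs(c))
    w = np.sign(c) * np.maximum(np.abs(c) - thresh, 0.0)
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w

# toy data: only the first 2 of 10 variables drive the outcome
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
y = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.standard_normal(100)

w = sparse_pls_direction(X, y, lam=0.2)
print(np.nonzero(w)[0])  # indices of the variables retained in the direction
```

The resulting sparse direction keeps the truly associated variables while most noise variables receive exactly zero weight, which is what makes the fitted components interpretable.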