2007
DOI: 10.1093/biostatistics/kxm013

Spatial smoothing and hot spot detection for CGH data using the fused lasso

Abstract: We apply the "fused lasso" regression method of (TSRZ2004) to the problem of "hot-spot detection", in particular, detection of regions of gain or loss in comparative genomic hybridization (CGH) data. The fused lasso criterion leads to a convex optimization problem, and we provide a fast algorithm for its solution. Estimates of false-discovery rate are also provided. Our studies show that the new method generally outperforms competing methods for calling gains and losses in CGH data.
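
For orientation, the fused lasso criterion mentioned in the abstract is commonly written in the penalized form below for a sequence of measurements along the genome. The symbols $y_i$ (observed log-ratio at position $i$), $\beta_i$ (smoothed estimate), and the tuning parameters $\lambda_1, \lambda_2$ are notation assumed here; they do not appear on this page.

```latex
\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^n}
  \frac{1}{2} \sum_{i=1}^{n} (y_i - \beta_i)^2
  + \lambda_1 \sum_{i=1}^{n} |\beta_i|
  + \lambda_2 \sum_{i=2}^{n} |\beta_i - \beta_{i-1}|
```

The first penalty shrinks estimates toward zero (no gain or loss), and the second penalizes differences between neighbouring positions, which favours piecewise-constant fits. Each term is convex in $\beta$, which is consistent with the abstract's statement that the criterion leads to a convex optimization problem.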

Cited by 307 publications (301 citation statements)
References 9 publications
“…The filtered sequencing data was then allocated to 20-kb sequencing bins with an average number of 30-35 reads per bin. To calculate the mean CNV variation, binned read counts at each genomic location were internally compared across a minimum of 15 sample data sets using the fused lasso smoothing algorithm [28]. Plots of log2 mean CNV (Y-axis) versus 20-kb bins (X-axis) were generated for each of the 24 chromosomes.…”
Section: Sequence Data Analysis and Derivation Of 24 Chromosome Profiles (mentioning)
confidence: 99%
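
The binning step described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the citing authors' pipeline: the function names, the pseudocount, and the use of a single reference count vector are all assumptions made for the example, and the fused lasso smoothing step itself is not shown.

```python
import math
from collections import Counter

BIN_SIZE = 20_000  # 20-kb bins, as in the quoted passage


def bin_read_counts(read_positions, n_bins, bin_size=BIN_SIZE):
    """Count reads per fixed-width bin (hypothetical helper).

    Positions at or beyond n_bins * bin_size are ignored.
    """
    counts = Counter(pos // bin_size for pos in read_positions)
    return [counts.get(b, 0) for b in range(n_bins)]


def log2_ratio(sample_counts, reference_counts, pseudocount=0.5):
    """Per-bin log2 ratio of sample to reference counts.

    The pseudocount (an assumption) avoids log of zero in empty bins.
    """
    return [
        math.log2((s + pseudocount) / (r + pseudocount))
        for s, r in zip(sample_counts, reference_counts)
    ]


# Toy read start positions on one chromosome (illustrative values only).
reads = [100, 19_999, 20_000, 45_000, 59_999]
counts = bin_read_counts(reads, n_bins=3)
```

In practice the reference vector would be derived from the other samples in the batch, and the resulting per-bin log2 values would then be passed to a smoother such as the fused lasso.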
“…Genomic segmentation was used to detect amplified and deleted segments with stringent parameters (P < 0.0001, > 20 markers, signal/noise ≥ 0.6, minimal region size = 100 markers). To control for hyperfragmentation adjacent segments separated by < 50 probes were combined into one single segment, and only segments > 100 probes were considered. Multiple hypothesis correction by Benjamini and Hochberg and by cghFLasso algorithm [21] was applied and FDR threshold was set at 0.05. Correspondence with gene expression was calculated by Spearman's (rank) correlation coefficient.…”
Section: SNP Array (mentioning)
confidence: 99%
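
The Benjamini and Hochberg correction named in the statement above can be sketched as a step-up procedure. This is a generic illustration with a hypothetical function name and made-up p-values; it does not reproduce the FDR estimate computed by cghFLasso, which is a separate method.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k (1-based, p-values sorted
    ascending) with p_(k) <= k * fdr / m, then reject ranks 1..k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr / m:
            max_rank = rank

    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


# Illustrative p-values only, one per candidate segment.
calls = benjamini_hochberg([0.001, 0.04, 0.03, 0.5], fdr=0.05)
```

With the 0.05 threshold from the quoted passage, only the first of these four toy p-values passes, because the second-smallest value (0.03) exceeds its step-up threshold of 0.025.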
“…We note that closeness to the observed data is modeled by a quadratic term, whereas the prior assumptions are encoded in ℓ1-terms; thus the optimization problem is related to the LASSO (least absolute shrinkage and selection operator; Tibshirani (1996); Tibshirani and Wang (2007)). The main difference to the standard LASSO approaches is that we have several ℓ1 terms with different weights in the objective function.…”
Section: Assumptions (mentioning)
confidence: 99%
“…Our approach Our work follows the Bayesian paradigm as well, but (a) with more detailed prior assumptions (i.e., we do not use the beta-binomial model), and (b) using approximations that cast the maximum a-posteriori-estimation computational as a strictly convex optimization problem related to LASSO (least absolute shrinkage and selection operator) approaches (Tibshirani, 1996; Tibshirani and Wang, 2007). We estimate the (unknown) true group-specific methylation rate at each CpG by using both the available (small sample and low coverage) data and certain smoothness assumptions.…”
Section: Introduction (mentioning)
confidence: 99%