Array-based comparative genomic hybridization (aCGH) enables the measurement of DNA copy number across thousands of locations in a genome. The main goals of analyzing aCGH data are to identify the regions of copy number variation (CNV) and to quantify the amount of CNV. Although there are many methods for analyzing single-sample aCGH data, the analysis of multi-sample aCGH data is a relatively new area of research. Further, many of the current approaches for analyzing multi-sample aCGH data do not appropriately utilize the additional information present in the multiple samples. We propose a procedure called the Fused Lasso Latent Feature Model (FLLat) that provides a statistical framework for modeling multi-sample aCGH data and identifying regions of CNV. The procedure involves modeling each sample of aCGH data as a weighted sum of a fixed number of features. Regions of CNV are then identified through an application of the fused lasso penalty to each feature. Simulation analyses show that FLLat outperforms single-sample methods when the simulated samples share common information. We also propose a method for estimating the false discovery rate. An analysis of an aCGH data set obtained from human breast tumors, focusing on chromosomes 8 and 17, shows that FLLat and Significance Testing of Aberrant Copy number (an alternative, existing approach) identify similar regions of CNV that are consistent with previous findings. However, through the estimated features and their corresponding weights, FLLat is further able to discern specific relationships between the samples, for example, identifying three distinct groups of samples based on their patterns of CNV for chromosome 17.
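To make the model structure concrete, the following is a minimal sketch of the kind of objective the abstract describes: each sample is approximated by a weighted sum of features, and each feature carries a fused lasso penalty (an L1 term for sparsity plus an L1 term on differences between adjacent probes for smoothness). The function name `fllat_objective`, the argument layout, and the tuning parameters `lam1` and `lam2` are illustrative assumptions, not the authors' implementation; the abstract does not state the exact loss or any constraints on the weights.

```python
from typing import List

Matrix = List[List[float]]


def fllat_objective(Y: Matrix, B: Matrix, Theta: Matrix,
                    lam1: float, lam2: float) -> float:
    """Squared-error fit plus a fused lasso penalty on each feature.

    Y     : probes x samples matrix of aCGH values
    B     : probes x features matrix (one column per latent feature)
    Theta : features x samples matrix of weights
    lam1, lam2 : hypothetical tuning parameters for sparsity and smoothness
    """
    n_probes, n_samples = len(Y), len(Y[0])
    n_features = len(Theta)

    # Residual sum of squares between each sample and its weighted sum of features.
    rss = 0.0
    for l in range(n_probes):
        for s in range(n_samples):
            fitted = sum(B[l][j] * Theta[j][s] for j in range(n_features))
            rss += (Y[l][s] - fitted) ** 2

    # Fused lasso penalty, applied to each feature (column of B) separately.
    penalty = 0.0
    for j in range(n_features):
        feature = [B[l][j] for l in range(n_probes)]
        sparsity = sum(abs(v) for v in feature)
        smoothness = sum(abs(feature[l] - feature[l - 1])
                         for l in range(1, n_probes))
        penalty += lam1 * sparsity + lam2 * smoothness

    return rss + penalty


if __name__ == "__main__":
    # Toy data: 3 probes, 2 features, 2 samples, with Y exactly equal to B @ Theta.
    B = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]]
    Theta = [[1.0, 0.5], [0.0, 0.5]]
    Y = [[0.0, 0.0], [1.0, 0.5], [1.0, 1.5]]
    print(fllat_objective(Y, B, Theta, lam1=1.0, lam2=2.0))
```

In this toy case the fit term is zero, so the value reflects only the penalty. Piecewise-constant features with few jumps keep the smoothness term small, which is why nonzero stretches of an estimated feature can be read as candidate regions of CNV.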
When applying hierarchical clustering algorithms to cluster patient samples from microarray data, the clustering patterns generated by most algorithms tend to be dominated by groups of highly differentially expressed genes that have closely related expression patterns. Sometimes, these genes may not be relevant to the biological process under study or their functions may already be known. The problem is that these genes can potentially drown out the effects of other genes that are relevant or have novel functions. We propose a procedure called complementary hierarchical clustering that is designed to uncover the structures arising from these novel genes that are not as highly expressed. Simulation studies show that the procedure is effective when applied to a variety of examples. We also define a concept called relative gene importance that can be used to identify the influential genes in a given clustering. Finally, we analyze a microarray data set from 295 breast cancer patients, using clustering with the correlation-based distance measure. The complementary clustering reveals a grouping of the patients that is uncorrelated with a number of known prognostic signatures and whose groups differ significantly in distant metastasis-free probability.
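As an illustration of the correlation-based distance mentioned above, here is a minimal sketch using one common definition, one minus the Pearson correlation between two samples' expression profiles. The abstract does not give the exact formula used, so this definition and the helper names `correlation_distance` and `distance_matrix` are assumptions for illustration only. The sketch covers the distance computation alone, not the complementary clustering procedure itself.

```python
from math import sqrt
from typing import List, Sequence


def correlation_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Return 1 - Pearson correlation between two expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - sxy / sqrt(sxx * syy)


def distance_matrix(samples: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise correlation distances between samples (one profile per sample)."""
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = correlation_distance(samples[i], samples[j])
            dist[i][j] = dist[j][i] = d
    return dist


if __name__ == "__main__":
    # Made-up profiles over four genes, purely for illustration.
    samples = [
        [1.0, 2.0, 3.0, 4.0],
        [2.0, 4.0, 6.0, 8.0],   # same shape as the first, larger scale
        [4.0, 3.0, 2.0, 1.0],   # reversed pattern
    ]
    for row in distance_matrix(samples):
        print([round(d, 3) for d in row])
```

Because the distance depends on the shape of a profile rather than its scale, a matrix like this could be passed to any standard hierarchical clustering routine that accepts precomputed distances.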
Background: Studying centre-to-centre (CTC) variation in mortality rates is important because inferences about quality of care can be made permitting changes in practice to improve outcomes. However, comparisons between hospitals can be misleading unless there is adjustment for population characteristics and severity of illness. Objective: We sought to report the risk-adjusted CTC variation in mortality among preterm infants born <32 weeks and admitted to all eight tertiary neonatal intensive care units (NICUs) in the New South Wales and the Australian Capital Territory Neonatal Network (NICUS), Australia. Methods: We analysed routinely collected prospective data for births between 2007 and 2014. Adjusted mortality rates for each NICU were produced using a multiple logistic regression model. Output from this model was used to construct funnel plots. Results: A total of 7212 live born infants <32 weeks gestation were admitted consecutively to network NICUs during the study period. NICUs differed in their patient populations and severity of illness. The overall unadjusted hospital mortality rate for the network was 7.9% (n=572 deaths). This varied from 5.3% in hospital E to 10.4% in hospital C. Adjusted mortality rates showed little CTC variation. No hospital reached the +99.8% control limit level on adjusted funnel plots. Conclusion: Characteristics of infants admitted to NICUs differ, and comparing unadjusted mortality rates should be avoided. Logistic regression-derived risk-adjusted mortality rates plotted on funnel plots provide a powerful visual graphical tool for presenting quality performance data. CTC variation is readily identified, permitting hospitals to appraise their practices and start timely intervention.
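The funnel plot idea can be sketched as follows: control limits are drawn around the network-wide rate and narrow as a unit's volume grows, so small units need a larger deviation before being flagged. This sketch assumes a normal approximation to the binomial, which may differ from the method used in the study; the study's limits were built from logistic regression-adjusted rates, which are not reproduced here. The function names and the admission volumes in the example are hypothetical; only the 572 deaths among 7212 infants come from the abstract.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def funnel_limits(target_rate: float, n: int,
                  coverage: float = 0.998) -> Tuple[float, float]:
    """Approximate two-sided control limits for a proportion at volume n.

    Uses the normal approximation to the binomial (an assumption of this sketch).
    """
    if not 0.0 < target_rate < 1.0 or n <= 0:
        raise ValueError("target_rate must be in (0, 1) and n must be positive")
    # Two-sided limits: 99.8% coverage puts 0.1% in each tail.
    z = NormalDist().inv_cdf(1.0 - (1.0 - coverage) / 2.0)
    half_width = z * sqrt(target_rate * (1.0 - target_rate) / n)
    lower = max(0.0, target_rate - half_width)
    upper = min(1.0, target_rate + half_width)
    return lower, upper


def outside_limits(observed_rate: float, target_rate: float, n: int,
                   coverage: float = 0.998) -> bool:
    """True if a unit's observed rate falls outside the control limits."""
    lower, upper = funnel_limits(target_rate, n, coverage)
    return observed_rate < lower or observed_rate > upper


if __name__ == "__main__":
    network_rate = 572 / 7212  # overall unadjusted rate reported in the abstract
    # Hypothetical admission volumes, used only to show how the funnel narrows.
    for volume in (250, 500, 1000, 2000):
        lower, upper = funnel_limits(network_rate, volume)
        print(f"n={volume:5d}  lower={lower:.3f}  upper={upper:.3f}")
```

Plotting each unit's rate against its volume, with these limits overlaid, gives the characteristic funnel shape.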