Recently, independent component analysis (ICA) has been widely used in the analysis of brain imaging data. An important problem with most ICA algorithms is, however, that they are stochastic; that is, their results may differ somewhat between runs of the algorithm. Thus, the outputs of a single run of an ICA algorithm should be interpreted with caution, and further analysis of the algorithmic reliability of the components is needed. Moreover, as with any statistical method, the results are affected by the random sampling of the data, and some analysis of the statistical significance or reliability should be done as well. Here we present a method for assessing both the algorithmic and statistical reliability of estimated independent components. The method is based on running the ICA algorithm many times under slightly different conditions and visualizing the clustering structure of the obtained components in the signal space. In experiments with magnetoencephalographic (MEG) and functional magnetic resonance imaging (fMRI) data, the method was able to show that expected components are reliable; furthermore, it pointed out components whose interpretation was not obvious but whose reliability should prompt the experimenter to investigate the underlying technical or physical phenomena. The method is implemented in a software package called Icasso.
A major problem in the application of independent component analysis (ICA) is that the reliability of the estimated independent components is not known. Firstly, the finite sample size induces statistical errors in the estimation. Secondly, as real data never exactly follow the ICA model, the contrast function used in the estimation may have many local minima which are all equally good, or the practical algorithm may not always perform properly, for example getting stuck in local minima with strongly suboptimal values of the contrast function. We present an explorative visualization method for investigating the relations between estimates from FastICA. The algorithmic and statistical reliability is investigated by running the algorithm many times with different initial values or with differently bootstrapped data sets, respectively. The resulting estimates are compared by visualizing their clustering according to a suitable similarity measure. Reliable estimates correspond to tight clusters, and unreliable ones to points which do not belong to any such cluster. We have developed a software package called Icasso to implement these operations. We also present results of this method when applying Icasso to biomedical data.
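The clustering idea described above can be illustrated with a small, self-contained sketch. It is not the Icasso implementation. Several choices here are assumptions made only for illustration: the absolute Pearson correlation as the similarity measure (absolute because ICA leaves the sign of a component undetermined), the 0.9 threshold, the simple single-linkage grouping, and the synthetic "estimates" that stand in for the output of repeated FastICA runs. All function names are hypothetical.

```python
import math
import random
from itertools import combinations


def abs_correlation(a, b):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return abs(cov) / math.sqrt(var_a * var_b)


def cluster_estimates(estimates, threshold=0.9):
    """Group estimates whose similarity reaches the threshold (single linkage).

    Returns lists of estimate indices, largest cluster first.
    """
    parent = list(range(len(estimates)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(estimates)), 2):
        if abs_correlation(estimates[i], estimates[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(estimates)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=len, reverse=True)


# Synthetic stand-in for repeated ICA runs (assumption: no real ICA is run).
# Each "run" returns two stable components with arbitrary sign and small
# noise, plus one unstable component that is pure noise.
rng = random.Random(0)
t = [i / 10 for i in range(100)]
source_a = [math.sin(x) for x in t]
source_b = [1.0 if math.sin(3 * x) > 0 else -1.0 for x in t]


def noisy(signal, sign, scale=0.05):
    return [sign * v + rng.gauss(0, scale) for v in signal]


n_runs = 5
estimates = []
for _ in range(n_runs):
    sign = rng.choice([-1, 1])
    estimates.append(noisy(source_a, sign))
    estimates.append(noisy(source_b, -sign))
    estimates.append([rng.gauss(0, 1) for _ in t])

clusters = cluster_estimates(estimates)
for members in clusters:
    label = "reliable" if len(members) == n_runs else "unreliable"
    print(f"cluster of size {len(members)}: {members} ({label})")
```

In this toy setting, a component that recurs in every run forms one tight cluster, while the noise-like estimates remain isolated points. With real data the threshold and the linkage rule would need to be chosen with care, and bootstrapped data sets would replace the synthetic noise.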
DNA copy number amplifications activate oncogenes and are hallmarks of nearly all advanced tumors. Amplified genes represent attractive targets for therapy, diagnostics and prognostics. To investigate DNA amplifications in different neoplasms, we performed a bibliomics survey using 838 published chromosomal comparative genomic hybridization studies and collected amplification data at chromosome band resolution from more than 4500 cases. Amplification profiles were determined for 73 distinct neoplasms. Neoplasms were clustered according to the amplification profiles, and frequently amplified chromosomal loci (amplification hot spots) were identified using computational modeling. To investigate the site specificity and mechanisms of gene amplifications, colocalization of amplification hot spots, cancer genes, fragile sites, virus integration sites and gene size cohorts were tested in a statistical framework. Amplification-based clustering demonstrated that cancers with similar etiology, cell-of-origin or topographical location have a tendency to obtain convergent amplification profiles. The identified amplification hot spots were colocalized with the known fragile sites, cancer genes and virus integration sites, but global statistical significance could not be ascertained. Large genes were significantly overrepresented on the fragile sites and the reported amplification hot spots. These findings indicate that amplifications are selected in the cancer tissue environment according to the qualitative traits and localization of cancer genes.
The PSORS1 locus is the consistently replicated genetic risk factor for psoriasis. Clinical associations with the main marker allele of PSORS1, HLA-Cw6, have been addressed in a number of studies, but clinical associations have not been used as a way to distinguish the effects of the neighbouring candidate genes in PSORS1. Our results show that HLA-Cw6 and CCHCR1 risk allele associations with clinical features of psoriasis are predictably highly similar in a Finnish nationwide cohort of 379 psoriasis patients. The clinical profiling of a small group of patients (n=34) who were HLA-Cw6- but CCHCR1*WWCC positive suggested that no great differences existed between them and HCR-Cw6- patients. HCR+ genotype (as well as Cw6+ genotype) correlated for the first time positively with female sex and, in contrast with previous studies, negatively with disease severity. Presence of psoriatic arthritis was more pronounced in HCR- psoriasis (as well as in Cw6- psoriasis). Clinical profiling may be a useful approach to distinguishing genetic effects of candidate genes even within a locus in sufficiently large cohorts.