The multiple testing procedure plays an important role in detecting the presence of spatial signals for large scale imaging data. Typically, the spatial signals are sparse but clustered. This paper provides empirical evidence that for a range of commonly used control levels, the conventional FDR procedure can lack the ability to detect statistical significance, even if the p-values under the true null hypotheses are independent and uniformly distributed; more generally, ignoring the neighboring information of spatially structured data will tend to diminish the detection effectiveness of the FDR procedure. This paper first introduces a scalar quantity to characterize the extent to which the “lack of identification phenomenon” (LIP) of the FDR procedure occurs. Second, we propose a new multiple comparison procedure, called FDRL, to accommodate the spatial information of neighboring p-values, via a local aggregation of p-values. Theoretical properties of the FDRL procedure are investigated under weak dependence of p-values. It is shown that the FDRL procedure alleviates the LIP of the FDR procedure, thus substantially facilitating the selection of more stringent control levels. Simulation evaluations indicate that the FDRL procedure improves the detection sensitivity of the FDR procedure with little loss in detection specificity. The computational simplicity and detection effectiveness of the FDRL procedure are illustrated through a real brain fMRI dataset.
Functional magnetic resonance imaging (fMRI) aims to locate activated regions
in human brains when specific tasks are performed. The conventional tool for
analyzing fMRI data applies some variant of the linear model, which is
restrictive in modeling assumptions. To yield more accurate prediction of the
time-course behavior of neuronal responses, the semiparametric inference for
the underlying hemodynamic response function is developed to identify
significantly activated voxels. Under mild regularity conditions, we
demonstrate that a class of the proposed semiparametric test statistics, based
on the local linear estimation technique, follow $\chi^2$ distributions under
null hypotheses for a number of useful hypotheses. Furthermore, the asymptotic
power functions of the constructed tests are derived under the fixed and
contiguous alternatives. Simulation evaluations and real fMRI data application
suggest that the semiparametric inference procedure provides more efficient
detection of activated brain areas than the popular imaging analysis tools AFNI
and FSL.Comment: Published in at http://dx.doi.org/10.1214/07-AOS519 the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
High-throughput studies have been extensively conducted in the research of complex human diseases. As a representative example, consider gene-expression studies where thousands of genes are profiled at the same time. An important objective of such studies is to rank the diagnostic accuracy of biomarkers (e.g. gene expressions) for predicting outcome variables while properly adjusting for confounding effects from low-dimensional clinical risk factors and environmental exposures. Existing approaches are often fully based on parametric or semi-parametric models and target evaluating estimation significance as opposed to diagnostic accuracy. Receiver operating characteristic (ROC) approaches can be employed to tackle this problem. However, existing ROC ranking methods focus on biomarkers only and ignore effects of confounders. In this article, we propose a model-based approach which ranks the diagnostic accuracy of biomarkers using ROC measures with a proper adjustment of confounding effects. To this end, three different methods for constructing the underlying regression models are investigated. Simulation study shows that the proposed methods can accurately identify biomarkers with additional diagnostic power beyond confounders. Analysis of two cancer gene-expression studies demonstrates that adjusting for confounders can lead to substantially different rankings of genes.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.