We propose a two-sample test for the means of high-dimensional data when the data dimension is much larger than the sample size. Hotelling's classical $T^2$ test does not work in this "large $p$, small $n$" situation. The proposed test does not require explicit conditions on the relationship between the data dimension and the sample size, which offers much flexibility in analyzing high-dimensional data. One application of the proposed test is testing the significance of sets of genes, which we demonstrate in an empirical study on a leukemia data set.
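To make the "large $p$, small $n$" setting concrete, the following is a minimal sketch of one statistic of this general kind. It is an illustration under stated assumptions, not necessarily the exact procedure of the paper: it estimates the squared distance between the two mean vectors using only inner products of distinct observations, so no $p \times p$ sample covariance matrix is formed or inverted (the step that breaks Hotelling's $T^2$ when $p > n$). The function names and the permutation calibration are assumptions introduced here; the abstract does not say how the test is calibrated.

```python
import random


def gram_matrix(rows):
    """Inner products between every pair of observations (rows)."""
    n = len(rows)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            v = sum(a * b for a, b in zip(rows[i], rows[j]))
            g[i][j] = g[j][i] = v
    return g


def mean_diff_statistic(g, idx_x, idx_y):
    """Unbiased estimate of ||mu_x - mu_y||^2 from a pooled Gram matrix.

    Only products of distinct observations (i != j) enter the within-sample
    sums, and no covariance matrix is inverted.
    """
    n1, n2 = len(idx_x), len(idx_y)
    sxx = sum(g[i][j] for i in idx_x for j in idx_x if i != j)
    syy = sum(g[i][j] for i in idx_y for j in idx_y if i != j)
    sxy = sum(g[i][j] for i in idx_x for j in idx_y)
    return sxx / (n1 * (n1 - 1)) + syy / (n2 * (n2 - 1)) - 2.0 * sxy / (n1 * n2)


def permutation_pvalue(x, y, n_perm=500, seed=0):
    """Permutation p-value (an assumed calibration, chosen for simplicity)."""
    rng = random.Random(seed)
    g = gram_matrix(x + y)  # computed once; permutations only re-index it
    n1, n = len(x), len(x) + len(y)
    labels = list(range(n))
    observed = mean_diff_statistic(g, labels[:n1], labels[n1:])
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if mean_diff_statistic(g, labels[:n1], labels[n1:]) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical simulated data: p = 200 variables, 10 and 12 observations.
    rng = random.Random(1)
    p, n1, n2 = 200, 10, 12
    x = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n1)]
    y = [[rng.gauss(0.5, 1.0) for _ in range(p)] for _ in range(n2)]
    print(permutation_pvalue(x, y))
```

Because the pooled Gram matrix is computed once, each permutation costs only $O(n^2)$ operations regardless of $p$, which keeps the sketch practical when the dimension is very large.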
We propose two tests for the equality of covariance matrices between two
high-dimensional populations. One test is on the whole variance--covariance
matrices, and the other is on off-diagonal sub-matrices, which define the
covariance between two nonoverlapping segments of the high-dimensional random
vectors. The tests are applicable (i) when the data dimension is much larger
than the sample sizes, namely the "large $p$, small $n$" situations and (ii)
without assuming parametric distributions for the two populations. These two
aspects surpass the capability of the conventional likelihood ratio test. The
proposed tests can be used to test on covariances associated with gene ontology
terms.
Comment: Published at http://dx.doi.org/10.1214/12-AOS993 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
By learning the PM$_{2.5}$ readings and meteorological records from 2010–2015, the severity of PM$_{2.5}$ pollution in Beijing is quantified with a set of statistical measures. As PM$_{2.5}$ concentration is highly influenced by meteorological conditions, we propose a statistical approach to adjust PM$_{2.5}$ concentration with respect to meteorological conditions, which can be used to monitor PM$_{2.5}$ pollution in a location. The adjusted monthly averages and percentiles are employed to test if the PM$_{2.5}$ levels in Beijing have been lowered since China's State Council set up a pollution reduction target. The results of the testing reveal significant increases, rather than decreases, in the PM$_{2.5}$ concentrations in the years 2013 and 2014 as compared with those in year 2012. We conduct analyses on two quasi-experiments (the Asia-Pacific Economic Cooperation meeting in November 2014 and the annual winter heating) to gain insight into the impacts of emissions on PM$_{2.5}$. The analyses lead to a conclusion that a fundamental shift from mainly coal-based energy consumption to much greener alternatives in Beijing and the surrounding North China Plain is the key to solving the PM$_{2.5}$ problem in Beijing.