Yinqiu He scite author profile

Wilks’ theorem, which offers universal chi-squared approximations for likelihood ratio tests, is widely used in many scientific hypothesis testing problems. For modern datasets with increasing dimension, researchers have found that the conventional Wilks’ phenomenon of the likelihood ratio test statistic often fails. Although new approximations have been proposed in high-dimensional settings, there is still no clear statistical guideline on how to choose between the conventional and newly proposed approximations, especially for moderate-dimensional data. To address this issue, we develop the necessary and sufficient phase transition conditions for Wilks’ phenomenon under popular tests on multivariate mean and covariance structures. Moreover, we provide an in-depth analysis of the accuracy of chi-squared approximations by deriving their asymptotic biases. These results may provide helpful insights into the use of chi-squared approximations in scientific practice.
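A minimal simulation sketch of the effect described in this abstract, not taken from the paper. It assumes one classical example chosen here for illustration: the one-sample test of a multivariate normal mean with unknown covariance, where the LRT statistic equals n log(1 + T²/(n − 1)) and, under the null, T²/(n − 1) is distributed as a ratio of independent chi-squared variables with p and n − p degrees of freedom. The function names and the choice to compare means (rather than quantiles) are assumptions made for this sketch.

```python
import math
import random


def lrt_stat_under_null(n, p, rng):
    """Draw one null value of -2 log(LRT) for H0: mean = 0, covariance unknown.

    Uses the textbook identity T^2/(n-1) = chi2_p / chi2_{n-p} under H0,
    so no data matrix has to be simulated.
    """
    chi2_p = rng.gammavariate(p / 2.0, 2.0)          # chi-squared, p d.f.
    chi2_np = rng.gammavariate((n - p) / 2.0, 2.0)   # chi-squared, n-p d.f.
    return n * math.log1p(chi2_p / chi2_np)


def mean_ratio(n, p, reps=20000, seed=0):
    """Monte Carlo mean of the LRT statistic divided by p.

    Wilks' chi-squared approximation has mean p, so a ratio near 1
    suggests the approximation is reasonable in its first moment.
    """
    rng = random.Random(seed)
    total = sum(lrt_stat_under_null(n, p, rng) for _ in range(reps))
    return total / reps / p


if __name__ == "__main__":
    n = 100
    for p in (2, 10, 25, 50):
        print(f"n={n}, p={p}: mean(LRT)/p = {mean_ratio(n, p):.3f}")
```

With n fixed, the ratio should drift away from 1 as p grows, which is one simple way to see the chi-squared approximation degrade. This only looks at the first moment; the paper's phase transition conditions and bias results are more precise statements that this sketch does not reproduce.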


Likelihood Ratio Test in Multivariate Linear Regression: from Low to High Dimension

He, Jiang, Wen et al. 2021

STAT SINICA


Multivariate linear regressions are widely used statistical tools in many applications to model the associations between multiple related responses and a set of predictors. To infer such associations, it is often of interest to test the structure of the regression coefficient matrix, and the likelihood ratio test (LRT) is one of the most popular approaches in practice. Despite its popularity, it is known that the classical χ² approximations for LRTs often fail in high-dimensional settings, where the dimensions of responses and predictors (m, p) are allowed to grow with the sample size n. Though various corrected LRTs and other test statistics have been proposed in the literature, the important question of when the classical LRT starts to fail is less studied; an answer to this would provide insights for practitioners, especially when analyzing data with m/n and p/n small but not negligible. Moreover, the power performance of the LRT in high-dimensional data analysis remains underexplored. To address these issues, the first part of this work gives the asymptotic boundary where the classical LRT fails and develops the corrected limiting distribution of the LRT for a general asymptotic regime. The second part of this work further studies the test power of the LRT in high-dimensional settings. The result not only advances the current understanding of the asymptotic behavior of the LRT under the alternative hypothesis, but also motivates the development of a power-enhanced LRT. The third part of this work considers the setting with p > n, where the LRT is not well-defined. We propose a two-step testing procedure by first performing dimension reduction and then applying the proposed LRT. Theoretical properties are developed to ensure the validity of the proposed method. Numerical studies are also presented to demonstrate its good performance. (arXiv:1812.06894v2 [math.ST], 3 Oct 2019)


Speeding up Monte Carlo simulations for the adaptive sum of powered score test with importance sampling

Deng et al. 2020

Biometrics


A central but challenging problem in genetic studies is to test for (usually weak) associations between a complex trait (e.g. a disease status) and sets of multiple genetic variants. Due to the lack of a uniformly most powerful test, data-adaptive tests, such as the adaptive sum of powered score (aSPU) test, are advantageous in maintaining high power against a wide range of alternatives. However, there is often no closed-form expression for calculating the p-values of many adaptive tests like aSPU accurately and analytically, so Monte Carlo (MC) simulations are often used, which can be time-consuming when a stringent significance level (e.g. 5e-8) is required, as in GWAS. To estimate such a small p-value, we need a huge number of MC simulations (e.g. 1e+10). As an alternative, we propose using importance sampling to speed up such calculations. We develop some theory to motivate a proposed algorithm for the aSPU test, and show that the proposed method is computationally more efficient than the standard MC simulations. Using both simulated and real data, we demonstrate the superior performance of the new method over the standard MC simulations.
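A toy sketch of why importance sampling helps with very small p-values. This is not the aSPU algorithm from the paper: it assumes a much simpler stand-in problem, estimating the tail probability P(Z > t) of a standard normal variable, and uses a proposal distribution shifted to N(t, 1). All function names are hypothetical and chosen for this illustration.

```python
import math
import random


def tail_prob_exact(t):
    """Exact P(Z > t) for a standard normal Z, used as a reference."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def tail_prob_naive(t, reps, rng):
    """Standard Monte Carlo: fraction of N(0, 1) draws exceeding t."""
    hits = sum(1 for _ in range(reps) if rng.gauss(0.0, 1.0) > t)
    return hits / reps


def tail_prob_importance(t, reps, rng):
    """Importance sampling with draws from the shifted proposal N(t, 1).

    Each draw x that lands in the tail is reweighted by the density ratio
    phi(x) / phi(x - t) = exp(-t * x + t**2 / 2).
    """
    total = 0.0
    for _ in range(reps):
        x = rng.gauss(t, 1.0)
        if x > t:
            total += math.exp(-t * x + 0.5 * t * t)
    return total / reps


if __name__ == "__main__":
    rng = random.Random(1)
    t, reps = 5.0, 100_000
    print("exact     :", tail_prob_exact(t))
    print("naive MC  :", tail_prob_naive(t, reps, rng))
    print("importance:", tail_prob_importance(t, reps, rng))
```

At t = 5 the true probability is of order 1e-7, so a standard MC run of 100,000 draws will usually see no tail events at all, while the shifted proposal places about half of its draws in the tail and recovers the probability through the weights. The same idea, applied to a far more involved test statistic, is what the abstract describes for aSPU.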




Yinqiu He

Asymptotically independent U-statistics in high-dimensional testing

On the phase transition of Wilks’ phenomenon

Likelihood Ratio Test in Multivariate Linear Regression: from Low to High Dimension

Speeding up Monte Carlo simulations for the adaptive sum of powered score test with importance sampling
