Abstract: In linear mixed-effects models, random effects are used to capture the heterogeneity and variability between individuals due to unmeasured covariates or unknown biological differences. Testing for the need of random effects is a nonstandard problem because it requires testing on the boundary of parameter space where the asymptotic chi-squared distribution of the classical tests such as likelihood ratio and score tests is incorrect. In the literature several tests have been proposed to overcome this difficulty,…
“…As a consequence, there is no open set containing the true variance components under the null hypothesis. Therefore, the classical asymptotic chi-squared distribution of the likelihood ratio (LR) or restricted LR test statistic is not valid (see, for example, Stram and Lee, 1994; Drikvandi et al., 2012, 2013; Drikvandi and Noorian, 2019). For testing zero variance components, it is shown that the correct asymptotic distribution of the LR or restricted LR statistic is a mixture of chi-squared distributions, provided that the response variable can be partitioned into independent subvectors and the number of subvectors tends to infinity (e.g., Stram and Lee, 1994).…”
Section: Testing For a Polynomial Fit Versus A Penalised Spline Smoother
Standard models for longitudinal data ignore the stochastic nature of time-varying covariates and their stochastic evolution over time by treating them as fixed variables. Recent methods for modelling time-varying covariates exist; however, those methods cannot be applied to analyse longitudinal data when the longitudinal response and the time-varying covariates for each subject are measured at different time points. Moreover, it is difficult to study the temporal effects of a time-varying covariate on the longitudinal response and the temporal correlation between them. Motivated by data from an AIDS cohort study conducted over 26 years at the University Hospitals Leuven, in which the measurements of the CD4 cell count and viral load for patients are not taken at the same time points, we present a framework to address those challenges by using joint multivariate mixed models to jointly model time-varying covariates and a longitudinal response, instead of including time-varying covariates in the response model. This approach also has the advantage that one can study the association between the covariate at any time point and the response at any other time point, without having to explicitly model the conditional distribution of the response given the covariate. We use penalised spline functions of time to capture the evolutions of both the response and time-varying covariates over time.
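The citation statement above notes that the LR statistic for a zero variance component follows a mixture of chi-squared distributions rather than a single chi-squared. A minimal sketch of how that changes a p-value is given below. It assumes the equal-weight mixture 0.5·χ²(q) + 0.5·χ²(q+1) commonly attributed to Stram and Lee (1994) for testing q versus q+1 random effects, and it only covers q = 0 and q = 1, where closed-form tail probabilities are available. The function names `upper_tail` and `mixture_pvalue` are illustrative and do not come from any of the cited papers.

```python
import math


def upper_tail(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable.

    Only df in {0, 1, 2} is supported, using closed forms:
    df = 0 is a point mass at zero, df = 1 uses the complementary
    error function, df = 2 is an exponential tail.
    """
    if x < 0:
        raise ValueError("the LR statistic must be non-negative")
    if df == 0:
        return 1.0 if x == 0 else 0.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only supports df in {0, 1, 2}")


def mixture_pvalue(lr_stat, q):
    """P-value under the assumed 50:50 mixture of chi2(q) and chi2(q + 1).

    q is the number of random effects under the null hypothesis
    (q = 0: no random effect versus a random intercept;
     q = 1: random intercept versus random intercept and slope).
    """
    return 0.5 * upper_tail(lr_stat, q) + 0.5 * upper_tail(lr_stat, q + 1)


if __name__ == "__main__":
    lr_stat = 2.7055  # hypothetical observed LR statistic, not from any dataset
    naive = upper_tail(lr_stat, 1)        # single chi2(1) reference
    mixed = mixture_pvalue(lr_stat, 0)    # 0.5*chi2(0) + 0.5*chi2(1) reference
    print(f"naive chi2(1) p-value: {naive:.4f}")
    print(f"mixture p-value:       {mixed:.4f}")
```

For q = 0 and a positive statistic, the mixture p-value is exactly half the naive χ²(1) p-value, which illustrates why ignoring the boundary makes the classical test conservative. The mixture result itself relies on the independence condition stated in the quotation, so this sketch would not apply to settings where the response cannot be partitioned into independent subvectors.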
“…Again, care ought to be taken when calculating the caught variance with associated or correlated loadings. Note that assessing and testing a significant variance in correlated models is a nonstandard testing problem [11][12][13][14].…”
High dimensional data are rapidly growing in many different disciplines, particularly in natural language processing. Natural language processing requires working with high dimensional matrices of word embeddings obtained from text data. Those matrices are often sparse in the sense that they contain many zero elements. Sparse principal component analysis is an advanced mathematical tool for the analysis of high dimensional data. In this paper, we study and apply sparse principal component analysis for natural language processing, which can effectively handle large sparse matrices. We study several formulations for sparse principal component analysis, together with algorithms for implementing those formulations. Our work is motivated and illustrated by a real text dataset. We find that sparse principal component analysis performs as well as ordinary principal component analysis in terms of accuracy and precision, while it shows two major advantages: faster calculations and easier interpretation of the principal components. These advantages are very helpful, especially in big data situations.
“…More recently, this approach has been considered by several authors in conjunction with empirical Bayesian and permutation tests (e.g. [24]), while Drikvandi and Noorian [7] have considered the permutation test for a broader class of linear mixed models with correlated errors. The results showed that both tests perform well, although the permutation test with the likelihood ratio statistic tends to provide relatively higher power when testing multiple random effects.…”
In the past decade, mixed-effects modeling has received a great deal of attention in the applied and theoretical statistical literature. Mixed-effects models are very flexible tools for analyzing repeated measures, panel data, cross-sectional data, and hierarchical data. However, the complex nature of these models has motivated researchers to study different aspects of this problem. One such aspect is testing the significance of random effects used to model unobserved heterogeneity in the population. A likelihood ratio test based on the normality assumption for the error term and random effects has been proposed. However, this assumption does not necessarily hold in practice. In this paper, we propose an optimal test based on the so-called uniform local asymptotic normality to detect the possible presence of random effects in linear mixed models. We show that the proposed test statistic is consistent and locally asymptotically optimal even for a model that does not require the traditional assumption of normality, and is comparable to the classical likelihood ratio test when the standard assumptions are met. Finally, simulation studies and real data analysis are conducted to empirically examine the performance of this procedure.