We develop asymptotic theory for weighted likelihood estimators (WLE) under two-phase stratified sampling without replacement. We also consider several variants of WLEs involving estimated weights and calibration. A set of empirical process tools are developed including a Glivenko–Cantelli theorem, a theorem for rates of convergence of M-estimators, and a Donsker theorem for the inverse probability weighted empirical processes under two-phase sampling and sampling without replacement at the second phase. Using these general results, we derive asymptotic distributions of the WLE of a finite-dimensional parameter in a general semiparametric model where an estimator of a nuisance parameter is estimable either at regular or nonregular rates. We illustrate these results and methods in the Cox model with right censoring and interval censoring. We compare the methods via their asymptotic variances under both sampling without replacement and the more usual (and easier to analyze) assumption of Bernoulli sampling at the second phase.
We introduce a general framework for estimation of inverse covariance, or precision, matrices from heterogeneous populations. The proposed framework uses a Laplacian shrinkage penalty to encourage similarity among estimates from disparate, but related, subpopulations, while allowing for differences among matrices. We propose an efficient alternating direction method of multipliers (ADMM) algorithm for parameter estimation, as well as its extension for faster computation in high dimensions by thresholding the empirical covariance matrix to identify the joint block diagonal structure in the estimated precision matrices. We establish both variable selection and norm consistency of the proposed estimator for distributions with exponential or polynomial tails. Further, to extend the applicability of the method to the settings with unknown populations structure, we propose a Laplacian penalty based on hierarchical clustering, and discuss conditions under which this data-driven choice results in consistent estimation of precision matrices in heterogenous populations. Extensive numerical studies and applications to gene expression data from subtypes of cancer with distinct clinical outcomes indicate the potential advantages of the proposed method over existing approaches.
The two-phase design has recently received attention in the statistical literature as an extension to the traditional case-control study for settings where a predictor of interest is rare or subject to missclassification. Despite a thorough methodological treatment and the potential for substantial efficiency gains, the two-phase design has not been widely adopted. This may be due, in part, to a lack of general-purpose, readily-available software. The osDesign package for R provides a suite of functions for analyzing data from a two-phase and/or case-control design, as well as evaluating operating characteristics, including bias, efficiency and power. The evaluation is simulation-based, permitting flexible application of the package to a broad range of scientific settings. Using lung cancer mortality data from Ohio, the package is illustrated with a detailed case-study in which two statistical goals are considered: (i) the evaluation of small-sample operating characteristics for two-phase and case-control designs and (ii) the planning and design of a future two-phase study.
Summary The log-rank test has been widely used to test treatment effects under the Cox model for censored time-to-event outcomes, though it may lose power substantially when the model’s proportional hazards assumption does not hold. In this paper, we consider an extended Cox model that uses B-splines or smoothing splines to model a time-varying treatment effect and propose score test statistics for the treatment effect. Our proposed new tests combine statistical evidence from both the magnitude and the shape of the time-varying hazard ratio function, and thus are omnibus and powerful against various types of alternatives. In addition, the new testing framework is applicable to any choice of spline basis functions, including B-splines and smoothing splines. Simulation studies confirm that the proposed tests performed well in finite samples and were frequently more powerful than conventional tests alone in many settings. The new methods were applied to the HIVNET 012 Study, a randomized clinical trial to assess the efficacy of single-dose Nevirapine against mother-to-child HIV transmission conducted by the HIV Prevention Trial Network.
We consider the variance estimation of the weighted likelihood estimator (WLE) under two‐phase stratified sampling without replacement. Asymptotic variance of the WLE in many semiparametric models contains unknown functions or does not have a closed form. The standard method of the inverse probability weighted (IPW) sample variances of an estimated influence function is then not available in these models. To address this issue, we develop the variance estimation procedure for the WLE in a general semiparametric model. The phase I variance is estimated by taking a numerical derivative of the IPW log likelihood. The phase II variance is estimated based on the bootstrap for a stratified sample in a finite population. Despite a theoretical difficulty of dependent observations due to sampling without replacement, we establish the (bootstrap) consistency of our estimators. Finite sample properties of our method are illustrated in a simulation study.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.