Background: Predicting hospital length of stay (LoS) for patients with COVID-19 infection is essential to ensure that adequate bed capacity can be provided without unnecessarily restricting care for patients with other conditions. Here, we demonstrate the utility of three complementary methods for predicting LoS using UK national- and hospital-level data.

Method: On a national scale, relevant patients were identified from the COVID-19 Hospitalisation in England Surveillance System (CHESS) reports. An Accelerated Failure Time (AFT) survival model and a truncation-corrected (TC) method, both with underlying Weibull distributions, were fitted to the data to estimate LoS from hospital admission date to an outcome (death or discharge) and from hospital admission date to Intensive Care Unit (ICU) admission date. In a second approach, we fitted a multi-state (MS) survival model to data directly from the Manchester University NHS Foundation Trust (MFT). We developed a planning tool that uses LoS estimates from these models to predict bed occupancy.

Results: All methods produced similar estimates of LoS for overall hospital stay, given that a patient is not admitted to ICU (8.4, 9.1 and 8.0 days for AFT, TC and MS, respectively). Estimates differed more markedly between the local and national levels for ICU stays: national estimates of ICU LoS from AFT and TC were 12.4 and 13.4 days, whereas the MS method applied to local data produced an estimate of 18.9 days.

Conclusions: Given the complexity and partiality of different data sources and the rapidly evolving nature of the COVID-19 pandemic, it is most appropriate to use multiple analysis methods on multiple datasets. The AFT method accounts for censored cases but does not allow different outcomes to be considered simultaneously. The TC method excludes censored cases, instead correcting for truncation in the data, but does distinguish these different outcomes. The MS method can model complex pathways to different outcomes whilst accounting for censoring, but cannot handle non-random case missingness. Overall, we conclude that data-driven modelling of LoS using these methods is useful in epidemic planning and management, and should be considered for widespread adoption throughout healthcare systems internationally where similar data resources exist.
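The abstract describes feeding Weibull LoS estimates into a bed-occupancy planning tool. The sketch below illustrates the basic mechanics of that step, not the paper's actual tool: it assumes hypothetical Weibull parameters (shape 1.3, scale 9.1, chosen so the mean is close to the reported 8.4-day non-ICU estimate) and sums, over earlier admission cohorts, the probability that those patients are still in hospital.

```python
import math

def weibull_mean(shape, scale):
    """Mean of a Weibull(shape, scale) length-of-stay distribution."""
    return scale * math.gamma(1.0 + 1.0 / shape)

def prob_still_in_hospital(t, shape, scale):
    """Weibull survival function: P(LoS > t days since admission)."""
    return math.exp(-((t / scale) ** shape))

def expected_occupancy(admissions_by_day, shape, scale, day):
    """Expected beds occupied on `day`: each earlier admission cohort
    contributes its size times the probability its patients remain."""
    return sum(n * prob_still_in_hospital(day - d, shape, scale)
               for d, n in admissions_by_day.items() if d <= day)

# Hypothetical parameters and admission counts for illustration only.
shape, scale = 1.3, 9.1
admissions = {0: 12, 1: 9, 2: 15}   # day -> new COVID-19 admissions
beds_day_2 = expected_occupancy(admissions, shape, scale, 2)
```

In practice the same survival function would be evaluated separately for ward and ICU stays, using the method-specific estimates above, to project occupancy for each bed type.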
Protection against disclosure is important for statistical agencies releasing microdata files from sample surveys. Simple measures of disclosure risk can provide useful evidence to support decisions about release. We propose a new measure of disclosure risk: the probability that a unique match between a microdata record and a population unit is correct. We argue that this measure has at least two advantages. First, we suggest that it may be a more realistic measure of risk than two measures that are currently used with census data. Second, we show that consistent inference (in a specified sense) may be made about this measure from sample data without strong modelling assumptions. This is a surprising finding, in its contrast with the properties of the two 'similar' established measures. As a result, this measure has potentially useful applications to sample surveys. In addition to obtaining a simple consistent predictor of the measure, we propose a simple variance estimator and show that it is consistent. We also consider the extension of inference to allow for certain complex sampling schemes. We present a numerical study based on 1991 census data for about 450 000 enumerated individuals in one area of Great Britain. We show that the theoretical results on the properties of the point predictor of the measure of risk and its variance estimator hold to a good approximation for these data. Copyright 2002 Royal Statistical Society.
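The risk measure described above can be illustrated with a toy computation. The sketch below is a hypothetical illustration of the quantity itself, not the paper's sample-based predictor (which, notably, requires no population file): if a record is unique in the microdata on its key variables, a unique match to a population unit sharing that key is correct with probability 1/F_k, where F_k is the population frequency of the key; averaging over sample uniques gives an overall risk figure.

```python
from collections import Counter

def unique_match_risk(sample_keys, population_keys):
    """Average probability that a unique match is correct, taken over
    records that are unique in the sample on their key values.

    Each sample-unique key k matches one of F_k population units at
    random, so the match is correct with probability 1 / F_k."""
    sample_counts = Counter(sample_keys)
    population_counts = Counter(population_keys)
    uniques = [k for k, c in sample_counts.items() if c == 1]
    if not uniques:
        return 0.0
    return sum(1.0 / population_counts[k] for k in uniques) / len(uniques)

# Toy data: keys are cross-classified categories (e.g. age band x area).
sample = ["a", "b", "b"]
population = ["a", "a", "b", "b", "b"]
risk = unique_match_risk(sample, population)   # "a" is sample-unique, F_a = 2
```

Here the single sample-unique record has two matching population units, so a unique match is correct with probability 0.5. The paper's contribution is to predict this quantity consistently from the sample alone, without observing the population counts used here.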
Anonymisation of personal data has a long history stemming from the expansion of the types of data products routinely provided by National Statistical Institutes. Variants on anonymisation have received serious criticism reinforced by much-publicised apparent failures. We argue that both the operators of such schemes and their critics have become confused by being overly focused on the properties of the data themselves. We claim that, far from being able to determine whether data are anonymous (and therefore non-personal) by looking at the data alone, any anonymisation technique worthy of the name must take account of not only the data but also their environment. This paper proposes an alternative formulation called functional anonymisation that focuses on the relationship between the data and the environment within which the data exist (their data environment). We provide a formulation for describing the relationship between the data and their environment that links the legal notion of personal data with the statistical notion of disclosure control. Anonymisation, properly conceived and effectively conducted, can be a critical part of the privacy-respecting data controller's toolkit and of the wider remit of providing accurate and usable data.