Fitting parametric random effects models in very large data sets with application to VHA national data

Gebregziabher, Mulugeta; Egede, Leonard E.; Gilbert, Gregory E.; Hunt, Kelly J.; Nietert, Paul J.; Mauldin, Patrick D.

doi:10.1186/1471-2288-12-163

Cited by 14 publications

(16 citation statements)

References 43 publications

(49 reference statements)

“…One of the major uses for clinical big data is in analysis of the prevalence or trends of a disease or phenotype among different populations. An early big data study evaluated a cohort consisting of 890,394 US veterans with diabetes followed from 2002 through 2006 [43]. Bermejo-Sanchez et al [44] observed 326 of the birth defect Amelia among 23 million live births, stillbirths, and fetal anomalies from 23 countries and 4 continents, and found the trend of higher prevalence of Amelia among younger mothers.…”

Section: Results (mentioning)

confidence: 99%

“…Gebregziabher et al [43] stated that the datasets generated through many translational research projects to answer questions of public health interest are not self-explanatory due to complexity and inadequate description/documentation of the dataset's parameters and associated metadata. The methodologies for interpreting the data can therefore be subject to all sorts of philosophical debate.…”

Section: Discussion (mentioning)

confidence: 99%

Big Data and Clinicians: A Review on the State of the Science

Wang

Krishnan²

2014

JMIR Med Inform

121

Background: In the past few decades, medically related data collection saw a huge increase, referred to as big data. These huge datasets bring challenges in storage, processing, and analysis. In clinical medicine, big data is expected to play an important role in identifying causality of patient symptoms, in predicting hazards of disease incidence or reoccurrence, and in improving primary-care quality. Objective: The objective of this review was to provide an overview of the features of clinical big data, describe a few commonly employed computational algorithms, statistical methods, and software toolkits for data manipulation and analysis, and discuss the challenges and limitations in this realm. Methods: We conducted a literature review to identify studies on big data in medicine, especially clinical medicine. We used different combinations of keywords to search PubMed, Science Direct, Web of Knowledge, and Google Scholar for literature of interest from the past 10 years. Results: This paper reviewed studies that analyzed clinical big data and discussed issues related to storage and analysis of this type of data. Conclusions: Big data is becoming a common feature of biological and clinical studies. Researchers who use clinical big data face multiple challenges, and the data itself has limitations. It is imperative that methodologies for data analysis keep pace with our ability to collect and store data.

“…In wound care, common examples include the unit of analysis (e.g., multiple wounds per patient) or patients treated at different investigational sites or settings or by different clinicians. 37,38 Any time significant clustering is encountered in a dataset, not adjusting for the clustering effects is likely to lead to overestimation of the effect size of the intervention or parameter under study. Mixed models in which such clusters are treated as random effects are the most frequently methods used for adjustment, but these methods are computationally intensive, and may be impossible in very big datasets due to available computing power or memory.…”

Section: Other Analytical Issues (mentioning)

confidence: 99%

“…Mixed models in which such clusters are treated as random effects are the most frequently methods used for adjustment, but these methods are computationally intensive, and may be impossible in very big datasets due to available computing power or memory. 38 Standard of care (SOC) is likely to vary considerably in large datasets but nevertheless can have a large impact on wound healing. The TIME algorithm, developed from a meeting of wound care experts, outlines an ideal SOC that includes tissue management, infection, moisture imbalance, and edge of the wound.…”

Section: Other Analytical Issues (mentioning)

confidence: 99%

Harnessing electronic healthcare data for wound care research: Wound registry analytic guidelines for less‐biased analyses

Carter

2017

Wound Repair Regeneration

Publications based on large healthcare databases that contain data pertaining to wound-related outcomes are starting to appear more frequently. However, concern exists in regard to study design adequacy, the methodology used to minimize misclassifications, bias, and confounding, and lack of full reporting. The STROBE guidelines were published to encourage fuller reporting of observational studies and have now been extended using the RECORD statement to better document routinely collected healthcare data. In this paper, elements of the RECORD statement have been used to create guidelines for study design, cohort matching, reporting criteria, and analysis frameworks in regard to analyses of populations involving comparative effectiveness research. It is recommended that researchers present full data analysis with minimal inclusion and exclusion criteria and preplanned subgroups analyses rather than attempt to emulate randomized controlled trials, as patterns of product administration are likely to be vastly different to those using controlled trials; moreover, missing data are very common. Suggestions for creating better matched cohorts, classification of wound- and patient-related variables, and a rationale for reporting at a minimum a particular set of benchmarks to better characterize wound care populations is also presented. Adherence to these guidelines would improve the credibility of studies and make comparisons between studies much easier. Finally, an adaptation of the Cochrane risk of bias tool is presented in connection with the proposed guidelines for systematic reviewers to assess these kinds of retrospective studies.

“…In general situations, one can partition the data between multiple processors, compute separate parameter estimates for each chunk and then combine the results (Huang and Gelman; Gebregziabher et al.; Khanna et al.; Scott et al.). These splitting strategies often require the same total computational cost, but they split the costs between K processors, reducing wall clock time by a factor of K.…”

Section: Introduction (mentioning)

confidence: 99%
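
The partition, estimate, and combine strategy described in the statement above can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the cited papers: it fits a simple linear regression rather than a random effects model, it pools the per-chunk slopes by inverse-variance weighting (one common combining rule among several), and the helper names `split`, `fit_chunk`, and `combine` are hypothetical. The chunks are fitted sequentially here; in practice each call to `fit_chunk` could be sent to a separate processor.

```python
import random
from statistics import mean


def split(seq, k):
    """Partition seq into k contiguous chunks of near-equal size."""
    size, rem = divmod(len(seq), k)
    chunks, start = [], 0
    for i in range(k):
        end = start + size + (1 if i < rem else 0)
        chunks.append(seq[start:end])
        start = end
    return chunks


def fit_chunk(xs, ys):
    """Least-squares slope and its estimated variance for one chunk."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    var_slope = rss / (n - 2) / sxx  # residual variance divided by Sxx
    return slope, var_slope


def combine(estimates):
    """Inverse-variance weighted average of (estimate, variance) pairs."""
    weights = [1.0 / v for _, v in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    return pooled, 1.0 / total


# Simulated data (illustrative only): y = 1 + 2x + standard normal noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(2000)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

K = 4  # number of chunks, standing in for K processors
estimates = [fit_chunk(cx, cy) for cx, cy in zip(split(xs, K), split(ys, K))]
pooled_slope, pooled_var = combine(estimates)
```

Each chunk does roughly 1/K of the work, which is where the wall clock saving in the statement comes from; the combining step itself is cheap because it only touches K pairs of numbers.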

Fast Moment-Based Estimation for Hierarchical Models

Perry

2016

Journal of the Royal Statistical Society Series B: Statistical Methodology

Hierarchical models allow for heterogeneous behaviours in a population while simultaneously borrowing estimation strength across all subpopulations. Unfortunately, existing likelihood-based methods for fitting hierarchical models have high computational demands, and these demands have limited their adoption in large-scale prediction and inference problems. The paper proposes a moment-based procedure for estimating the parameters of a hierarchical model which has its roots in a method originally introduced by Cochran in 1937. The method trades statistical efficiency for computational efficiency. It gives consistent parameter estimates, competitive prediction error performance and substantial computational improvements. When applied to a large-scale recommender system application and compared with a standard maximum likelihood procedure, the method delivers competitive prediction performance while reducing the sequential computation time from hours to minutes.
