Time series data collected in clinical trials can have varying degrees of missingness, which complicates statistical analysis. Missing data in randomized controlled trials (RCTs) add a further layer of complexity, because researchers must remain blinded to whether participants belong to the intervention or the control group. This restriction severely limits the applicability of conventional imputation methods that would draw on other participants’ data to improve performance. This paper explores and compares various methods to impute high-resolution temperature logger data in RCT settings. In addition to the conventional non-parametric approaches, we propose a spline regression (SR) approach that captures each participant's own time-of-day dynamics of indoor temperature. We investigate how including external temperature and energy use can improve model performance. Results show that SR imputation yields a 16% smaller root mean squared error (RMSE) than conventional imputation methods, with the gap widening to 22% when more than half of the data are missing. The SR method is particularly useful when missingness occurs simultaneously for multiple participants, such as during concurrent battery failures. We demonstrate how proper modelling of periodic dynamics can lead to significantly improved imputation performance, even with limited data.
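The idea of imputing from a participant's own time-of-day profile can be illustrated with a minimal sketch. This is not the authors' model: it assumes a periodic piecewise-linear (degree-1) spline with evenly spaced knots, uses no external temperature or energy-use covariates, and runs on synthetic data. The function names `periodic_hat_basis` and `impute_by_time_of_day`, the knot count, and the simulated series are all hypothetical choices made for illustration.

```python
import numpy as np


def periodic_hat_basis(hours, n_knots=8):
    """Design matrix of periodic piecewise-linear spline basis functions.

    Each column is a "hat" function centred on one of n_knots evenly
    spaced knots over a 24 h cycle, wrapping around midnight.
    """
    hours = np.asarray(hours, dtype=float) % 24.0
    spacing = 24.0 / n_knots
    knots = np.arange(n_knots) * spacing
    # Circular distance (in hours) from every timestamp to every knot.
    diff = np.abs(hours[:, None] - knots[None, :])
    dist = np.minimum(diff, 24.0 - diff)
    # Hat function: 1 at the knot, falling linearly to 0 at its neighbours.
    return np.clip(1.0 - dist / spacing, 0.0, None)


def impute_by_time_of_day(hours, temps, n_knots=8):
    """Fill NaN readings using a spline fit to one participant's own data."""
    hours = np.asarray(hours, dtype=float)
    temps = np.asarray(temps, dtype=float)
    observed = ~np.isnan(temps)
    X = periodic_hat_basis(hours, n_knots)
    # Ordinary least squares on the observed readings only.
    coef, *_ = np.linalg.lstsq(X[observed], temps[observed], rcond=None)
    filled = temps.copy()
    filled[~observed] = X[~observed] @ coef
    return filled


# Synthetic example (assumed data): two weeks of half-hourly readings
# with a daily cycle, measurement noise, and a 50-hour logger gap.
rng = np.random.default_rng(0)
hours = np.arange(0, 24 * 14, 0.5)
true = 20.0 + 2.0 * np.sin(2.0 * np.pi * (hours - 9.0) / 24.0)
temps_missing = true + rng.normal(0.0, 0.2, hours.size)
gap = np.zeros(hours.size, dtype=bool)
gap[200:300] = True
temps_missing[gap] = np.nan

filled = impute_by_time_of_day(hours, temps_missing)
rmse = np.sqrt(np.mean((filled[gap] - true[gap]) ** 2))
print(f"RMSE over the gap: {rmse:.3f}")
```

Because the fit uses only the participant's own observed readings, it does not require pooling data across trial arms, which is the constraint the abstract describes. A smoother basis (for example cubic splines) or additional covariates could be swapped in without changing the overall structure.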
Occupant behaviour plays a significant role in shaping the dynamics of energy consumption in buildings, but its complex nature has hindered a deeper understanding of its influence. A meta-analysis was conducted on 65 published studies that used data-driven quantitative methods to assess energy-related occupant behaviour using the Knowledge Discovery and Data Mining (KDD) framework. Hierarchical clustering was utilised to categorise different modelling techniques based on the intended outcomes of the model and the types of parameters used in various models. This study will assist researchers in selecting the most appropriate parameters and methods under various data constraints and research questions. The research revealed two distinct model categories used to study occupant behaviour-driven energy consumption, namely (i) occupancy status models and (ii) energy-related behaviour models. Multiple studies identified limitations on data collection and privacy concerns as constraints on modelling occupant behaviour in residential buildings. The “regression model” and its variants were found to be the preferred model types for research that models “energy-related behaviour”, whereas “classification models” were preferred for modelling “occupancy” status. Few data-driven studies modelled occupant behaviour in low-income households, and region-specific models are needed to model energy-related behaviour accurately.