Models of water resources systems are conceived to capture the underlying environmental dynamics occurring within watersheds. All such models can be regarded as working hypotheses, differing in their process representation and conceptualization. Most of the associated effort in the water resources research community is dedicated to developing new models that perform well under specific atmospheric conditions and catchment properties. In this context, flexible modeling frameworks are gaining importance because they facilitate model building by providing the building blocks, leaving the hydrologist free to assemble the model for the task at hand. Such flexible models have a high degree of transferability, which in turn aids progress toward a unified hydrological theory at the catchment scale. However, without sufficient insight into a catchment's characteristics and/or without expert knowledge, one may have to try a large number of model configurations based on the available building blocks to construct an appropriate model for the catchment of interest. This can be time consuming and computationally intensive. This paper proposes a novel model building algorithm, which uses the full potential of flexible modeling frameworks by searching the model space and inferring suitable model configurations through machine learning. The proposed machine learning algorithm is based on an evolutionary computation approach using genetic programming (GP). State-of-the-art GP applications in rainfall-runoff modeling have so far used the algorithm as a short-term forecasting tool that generates an expected future time series, much like neural network applications. Here, by contrast, the proposed algorithm develops a physically meaningful rainfall-runoff model. Although at the moment we learn models using two flexible modeling frameworks (SUPERFLEX and FUSE), the model induction toolkit can be armed with any internally coherent collection of building blocks.
The model induction capabilities of the proposed framework have been evaluated on the Blackwater River basin, Alabama, United States. The model configurations evolved through the model induction toolkit are consistent with the fieldwork investigations and previously reported research findings.
Fixed Models Versus Flexible Models
Design of conceptual models traditionally begins with a perceptual model derived from insights gained on the basis of fieldwork and experience, proceeding through a mathematical formulation of the hypothesized structure to the numerically robust implementation in computer code. Conceptual hydrological modeling can be broadly classified into single- and multiple-hypothesis (often referred to as flexible) modeling approaches. Development of models with fixed structure is based on the identification of a general model structure that is physically realistic and applicable to a reasonably wide range of catchments and climatic conditions. Several
Abstract. Despite showing great success of applications in many commercial fields, machine learning and data science models generally show limited success in many scientific fields, including hydrology (Karpatne et al., 2017). The approach is often criticized for its lack of interpretability and physical consistency. This has led to the emergence of new modelling paradigms, such as theory-guided data science (TGDS) and physics-informed machine learning. The motivation behind such approaches is to improve the physical meaningfulness of machine learning models by blending existing scientific knowledge with learning algorithms. Following the same principles in our prior work (Chadalawada et al., 2020), a new model induction framework was founded on genetic programming (GP), namely the Machine Learning Rainfall–Runoff Model Induction (ML-RR-MI) toolkit. ML-RR-MI is capable of developing fully fledged lumped conceptual rainfall–runoff models for a watershed of interest using the building blocks of two flexible rainfall–runoff modelling frameworks. In this study, we extend ML-RR-MI towards inducing semi-distributed rainfall–runoff models. The meaningfulness and reliability of hydrological inferences gained from lumped models may tend to deteriorate within large catchments where the spatial heterogeneity of forcing variables and watershed properties is significant. This was the motivation behind developing our machine learning approach for distributed rainfall–runoff modelling titled Machine Induction Knowledge Augmented – System Hydrologique Asiatique (MIKA-SHA). MIKA-SHA captures spatial variabilities and automatically induces rainfall–runoff models for the catchment of interest without any explicit user selections. Currently, MIKA-SHA learns models utilizing the model building components of two flexible modelling frameworks. However, the proposed framework can be coupled with any internally coherent collection of building blocks. 
MIKA-SHA's model induction capabilities have been tested on the Rappahannock River basin near Fredericksburg, Virginia, USA. MIKA-SHA builds and tests many model configurations using the model building components of the two flexible modelling frameworks and quantitatively identifies the optimal model for the watershed of concern. In this study, MIKA-SHA is utilized to identify two optimal models (one from each flexible modelling framework) to capture the runoff dynamics of the Rappahannock River basin. Both optimal models achieve high-efficiency values in hydrograph predictions (both at catchment and subcatchment outlets) and good visual matches with the observed runoff response of the catchment. Furthermore, the resulting model architectures are compatible with previously reported research findings and fieldwork insights of the watershed and are readily interpretable by hydrologists. MIKA-SHA-induced semi-distributed model performances were compared against existing lumped model performances for the same basin. MIKA-SHA-induced optimal models outperform the lumped models used in this study in terms of efficiency values while benefitting hydrologists with more meaningful hydrological inferences about the runoff dynamics of the Rappahannock River basin.
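The "efficiency values" mentioned above can be illustrated with a small sketch. The abstract does not name the measure used, so this assumes the Nash-Sutcliffe efficiency (NSE), a common choice for hydrograph evaluation; the function name and the sample series are hypothetical.

```python
from typing import Sequence


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - SSE / sum((obs - mean(obs))^2).

    1.0 is a perfect match; 0.0 means the simulation is no better than
    always predicting the mean of the observations.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - sse / spread


# Hypothetical runoff series (not data from the study).
observed_runoff = [1.0, 2.0, 3.0]
mean_only_model = [2.0, 2.0, 2.0]   # always predicts the observed mean
score = nash_sutcliffe(observed_runoff, mean_only_model)
```

In a semi-distributed setting the same function could be applied separately to the catchment outlet and to each subcatchment outlet, which is one way to read "both at catchment and subcatchment outlets".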
Abstract. Despite great success in many commercial applications, machine learning and data science models in general show limited use in scientific fields, including hydrology. The approach is often criticized for its lack of interpretability and physical consistency. This has led to the emergence of new paradigms, such as Theory Guided Data Science (TGDS) and physics-informed machine learning. The motivation behind such approaches is to improve the physical meaningfulness of machine learning models by blending existing scientific knowledge with learning algorithms. Following the same principles, in our prior work (Chadalawada et al., 2020), a new model induction framework was founded on Genetic Programming (GP), namely the Machine Learning Rainfall-Runoff Model Induction Toolkit (ML-RR-MI). ML-RR-MI is capable of developing fully fledged lumped conceptual rainfall-runoff models for a watershed of interest using the building blocks of two flexible rainfall-runoff modelling frameworks (FUSE and SUPERFLEX). In this study, we extend ML-RR-MI towards inducing semi-distributed rainfall-runoff models. This effort is motivated by the desire to address the limitations of lumped models, whose meaningfulness tends to deteriorate particularly within large catchments where the spatial heterogeneity of forcing variables and watershed properties is significant. Hence, our machine learning approach for rainfall-runoff modelling, titled Machine Induction Knowledge-Augmented System Hydrologique Asiatique (MIKA-SHA), captures spatial variabilities and automatically induces rainfall-runoff models for the catchment of interest without any subjectivity in model selection. Currently, MIKA-SHA learns models utilizing the model building components of FUSE and SUPERFLEX. However, the proposed framework can be coupled with any internally coherent collection of building blocks.
MIKA-SHA’s model induction capabilities have been tested on the Red Creek catchment near Vestry, Mississippi, United States. The model architectures induced by MIKA-SHA are compatible with previously reported research findings and fieldwork insights of the watershed and are readily interpretable by hydrologists.
Context: Identification of the effect of valid factors on students' academic performance is of great importance to student counseling and policy making. Aims: This study was carried out to find the predictors of academic performance of 2nd year undergraduate medical students of a renowned Medical College of Kolkata. Materials and Methods: This cross-sectional study was carried out in a tertiary care teaching hospital of Kolkata. Information on factors such as attendance percentage, sex, place of residence, and previous academic performance of the entire batch of 2nd year students was collected from the departments' academic records and through personal interviews. The association of these factors with students' academic performance was determined through statistical analysis using the t-test and multiple linear regression modeling, and the results were reported. Results: Academic performance is found to be weakly correlated with attendance. A better academic grade was observed for the group with a high attendance percentage than for the group with a low attendance percentage (P < 0.01). A higher percentage of marks was scored by female students (P < 0.01), local students (P < 0.01) and high performers who cleared their 1st year's coursework in their first attempt (P < 0.01). Conclusion: All the factors studied in this paper, which include attendance, sex, place of residence and previous academic performance, serve as predictors in understanding students' performance. Among these, student attendance is an important factor that has to be monitored and regulated through corrective actions to improve the performance of the class.
One of the more perplexing challenges for the hydrologic research community is the need for development of coupled systems involving integration of hydrologic, atmospheric and socio-economic relationships. Given the demand for integrated modelling and availability of enormous data with varying degrees of (un)certainty, there exists growing popularity of data-driven, unified theory catchment scale hydrological modelling frameworks. Recent research focuses on representation of distinct hydrological processes using mathematical model components that vary in a controlled manner, thereby deriving relationships between alternative conceptual model constructs and catchments’ behaviour. With increasing computational power, an evolutionary approach to auto-configuration of conceptual hydrological models is gaining importance. Its successful implementation depends on the choice of evolutionary algorithm, inventory of model components, numerical implementation, rules of operation and fitness functions. In this study, genetic programming is used as an example of evolutionary algorithm that employs modelling decisions inspired by the Superflex framework to automatically induce optimal model configurations for the given catchment dataset. The main objective of this paper is to identify the effects of entropy, hydrological and statistical measures as optimization objectives on the performance of the proposed approach based on two synthetic case studies of varying complexity.
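The ingredients listed above (an evolutionary algorithm, an inventory of model components, a numerical implementation and a fitness function) can be put together in a toy sketch. Everything here is an illustrative assumption: the inventory is a hypothetical stand-in rather than the actual Superflex decisions, the search is a plain mutation-and-selection loop rather than tree-based genetic programming, and Nash-Sutcliffe efficiency is used as a single fitness function on synthetic data.

```python
import random

# Hypothetical inventory of modelling decisions (not the real Superflex options).
INVENTORY = {
    "n_reservoirs": [1, 2, 3],
    "outflow": ["linear", "power"],
    "k": [0.1, 0.3, 0.5],
}


def simulate(config, rain):
    """Route rainfall through a cascade of reservoirs; return the runoff series."""
    stores = [0.0] * config["n_reservoirs"]
    exponent = 1.0 if config["outflow"] == "linear" else 1.5
    runoff = []
    for p in rain:
        inflow = p
        for i in range(len(stores)):
            stores[i] += inflow
            # Outflow can never exceed the water held in the store.
            out = min(stores[i], config["k"] * stores[i] ** exponent)
            stores[i] -= out
            inflow = out
        runoff.append(inflow)
    return runoff


def nse(obs, sim):
    """Nash-Sutcliffe efficiency, used here as the fitness function."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def mutate(config, rng):
    """Copy a configuration and re-draw one randomly chosen decision."""
    child = dict(config)
    key = rng.choice(sorted(INVENTORY))
    child[key] = rng.choice(INVENTORY[key])
    return child


def evolve(rain, observed, generations=30, pop_size=8, seed=0):
    """Elitist search: keep the better half, refill with mutated copies."""
    rng = random.Random(seed)
    population = [{k: rng.choice(v) for k, v in INVENTORY.items()}
                  for _ in range(pop_size)]

    def score(config):
        return nse(observed, simulate(config, rain))

    for _ in range(generations):
        parents = sorted(population, key=score, reverse=True)[: pop_size // 2]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    best = max(population, key=score)
    return best, score(best)


# Synthetic case study: "observations" generated from a known configuration.
RAIN = [0.0, 5.0, 10.0, 0.0, 0.0, 3.0, 8.0, 0.0, 0.0, 0.0, 2.0, 0.0] * 3
TRUE_CONFIG = {"n_reservoirs": 2, "outflow": "power", "k": 0.3}
OBSERVED = simulate(TRUE_CONFIG, RAIN)

best_config, best_score = evolve(RAIN, OBSERVED)
```

Swapping `nse` for an entropy-based or purely statistical measure changes only the `score` function, which is the kind of comparison of optimization objectives the abstract describes.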
The rainfall–runoff process is highly nonlinear, time varying, spatially distributed, and not easily described by simple models. Various models have been developed to simulate this process, including lumped conceptual models, distributed physically based models, and empirical black‐box models. Both conceptual and distributed physically based models require a significant amount of data for calibration and validation, whereas in most cases it is difficult to collect all the necessary data with sufficient accuracy for such models. Traditional black‐box models such as artificial neural networks provide a means to ease the data demands of model calibration and validation; however, the information they provide adds little insight into the underlying process. Using data‐driven techniques such as genetic programming (GP), one can attempt to model the rainfall–runoff process based on available hydrometeorological data. Genetic programming can also be used in combination with conceptual models to discover physically interpretable models or equations describing the physical processes. After a detailed review of the conventional applications of genetic programming in rainfall–runoff modeling, this article introduces a novel scheme of conceptual rainfall–runoff modeling based on GP. A tropical case study is provided as a prototype for illustration.
As new grid edge technologies emerge—such as rooftop solar panels, battery storage, and controllable water heaters—quantifying the uncertainties of building load forecasts is becoming more critical. The recent adoption of smart meter infrastructures provided new granular data streams, largely unavailable just ten years ago, that can be utilized to better forecast building-level demand. This paper uses Bayesian Structural Time Series for probabilistic load forecasting at the residential building level to capture uncertainties in forecasting. We use sub-hourly electrical submeter data from 120 residential apartments in Singapore that were part of a behavioral intervention study. The proposed model addresses several fundamental limitations through its flexibility to handle univariate and multivariate scenarios, perform feature selection, and include either static or dynamic effects, as well as its inherent applicability for measurement and verification. We highlight the benefits of this process in three main application areas: (1) Probabilistic Load Forecasting for Apartment-Level Hourly Loads; (2) Submeter Load Forecasting and Segmentation; (3) Measurement and Verification for Behavioral Demand Response. Results show the model achieves a similar performance to ARIMA, another popular time series model, when predicting individual apartment loads, and superior performance when predicting aggregate loads. Furthermore, we show that the model robustly captures uncertainties in the forecasts while providing interpretable results, indicating the importance of, for example, temperature data in its predictions. Finally, our estimates for a behavioral demand response program indicate that it achieved energy savings; however, the confidence interval provided by the probabilistic model is wide. 
Overall, this probabilistic forecasting model accurately measures uncertainties in forecasts and provides interpretable results that can support building managers and policymakers with the goal of reducing energy use.
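How a structural time series model yields a forecast with an uncertainty band can be shown with a minimal sketch. This is not the paper's model: it assumes only the simplest structural component, a local level (random walk plus noise) with known variances, filtered with a Kalman filter. A full Bayesian Structural Time Series model would add seasonal and regression components and estimate the variances by posterior sampling. All names and numbers are hypothetical.

```python
import math


def local_level_forecast(y, obs_var, level_var, init_mean=0.0, init_var=1e6):
    """Kalman filter for y_t = mu_t + eps_t, mu_t = mu_{t-1} + eta_t.

    Returns the one-step-ahead forecast mean and variance for the next
    observation, given known noise variances obs_var and level_var.
    """
    mean, var = init_mean, init_var
    for obs in y:
        var_pred = var + level_var                 # predict: level drifts
        gain = var_pred / (var_pred + obs_var)     # Kalman gain
        mean = mean + gain * (obs - mean)          # update with new reading
        var = (1.0 - gain) * var_pred
    forecast_var = var + level_var + obs_var       # state + drift + noise
    return mean, forecast_var


def interval(mean, var, z=1.96):
    """Symmetric Gaussian interval; z = 1.96 gives roughly 95% coverage."""
    half_width = z * math.sqrt(var)
    return mean - half_width, mean + half_width


# Hypothetical hourly apartment load in kWh (not data from the study).
load = [2.0] * 50
forecast_mean, forecast_var = local_level_forecast(load, obs_var=1.0, level_var=0.01)
lower, upper = interval(forecast_mean, forecast_var)
```

The forecast variance is always larger than the observation noise alone, because uncertainty about the current level and its future drift is added on top. This is why intervals from such a model can be wide even when the point forecast looks reasonable.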