Open and decentralized technologies such as the Internet provide increasing opportunities to create knowledge and deliver computer-based decision support for multiple types of users across scales. However, environmental decision support systems/tools (henceforth EDSS) are often strongly science-driven and assume a single type of decision maker, and are hence poorly suited for more decentralized and polycentric decision-making contexts. In such contexts, EDSS need to be tailored to meet diverse user requirements to ensure that they provide useful (relevant), usable (intuitive), and exchangeable (institutionally unobstructed) information for decision support for different types of actors. To address these issues, we present a participatory framework for designing EDSS that emphasizes a more complete understanding of the decision-making structures and iterative design of the user interface. We illustrate the application of the framework through a case study within the context of water-stressed upstream/downstream communities in Lima, Peru.
Forest fires are an integral part of the natural Earth system dynamics; however, they are becoming more devastating and less predictable as anthropogenic climate change exacerbates their impacts. In order to advance fire science, fire danger reanalysis products can be used as a proxy for fire weather observations, with the advantage of being homogeneously distributed in both space and time. This manuscript describes a reanalysis dataset of fire danger indices based on the Canadian Fire Weather Index system and the ECMWF ERA5 reanalysis dataset, which supersedes the previous dataset based on ERA-Interim. The new fire danger reanalysis dataset provides a number of benefits compared to the one based on ERA-Interim: it relies on better estimates of precipitation, evaporation and soil moisture; it is available in a deterministic form as well as a probabilistic ensemble; and it is characterised by a considerably higher spatial resolution. It is a valuable resource for forestry agencies and scientists in the field of wildfire danger modeling and beyond. The global dataset is produced by ECMWF, as the computational centre of the European Forest Fire Information System (EFFIS) of the Copernicus Emergency Management Service, and it is made available free of charge through the Climate Data Store.
Learning the structure of Bayesian networks from data is known to be a computationally challenging, NP-hard problem. The literature has long investigated how to perform structure learning from data containing large numbers of variables, following a general interest in high-dimensional applications ("small n, large p") in systems biology and genetics. More recently, data sets with large numbers of observations (the so-called "big data") have become increasingly common; these data sets are not necessarily high-dimensional, sometimes having only a few tens of variables depending on the application. We revisit the computational complexity of Bayesian network structure learning in this setting, showing that the common choice of measuring it with the number of estimated local distributions leads to unrealistic time complexity estimates for the most common class of score-based algorithms, greedy search. We then derive more accurate expressions under common distributional assumptions. These expressions suggest that the speed of Bayesian network learning can be improved by taking advantage of the availability of closed-form estimators for local distributions with few parents. Furthermore, we find that using predictive instead of in-sample goodness-of-fit scores improves speed; and we confirm that it improves the accuracy of network recon-
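The closed-form estimators mentioned in this abstract can be illustrated with a minimal sketch. It assumes Gaussian local distributions and BIC as the network score; both are illustrative choices rather than details taken from the paper, and the function names are hypothetical. For a node with zero or one parent, the maximum-likelihood fit needs only sample means, variances and a covariance, so no general regression solver is required.

```python
import math


def gaussian_loglik(rss, n):
    """Maximised Gaussian log-likelihood for a residual sum of squares `rss`."""
    variance = rss / n  # maximum-likelihood variance estimate
    return -0.5 * n * (math.log(2 * math.pi * variance) + 1)


def bic_no_parent(y):
    """BIC of a node with no parents: only a mean and a variance are fitted."""
    n = len(y)
    mean_y = sum(y) / n
    rss = sum((v - mean_y) ** 2 for v in y)
    return gaussian_loglik(rss, n) - 0.5 * 2 * math.log(n)


def bic_one_parent(y, x):
    """BIC of a node with one parent, using the closed-form regression slope."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx           # closed-form least-squares slope
    rss = syy - slope * sxy     # residual sum of squares after the fit
    return gaussian_loglik(rss, n) - 0.5 * 3 * math.log(n)


# Hypothetical toy data: y depends almost linearly on x.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0.1, 2.0, 3.9, 6.2, 8.1, 9.8, 12.1, 14.0]

# Score change a greedy search would evaluate for adding the arc x -> y.
delta = bic_one_parent(y, x) - bic_no_parent(y)
```

A greedy search would accept the arc when `delta` is positive. Each such evaluation costs a single pass over the data, which is the kind of saving the abstract attributes to local distributions with few parents.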
Abstract. The open-source programming language R has gained a central place in the hydrological sciences over the last decade, driven by the availability of diverse hydro-meteorological data archives and the development of open-source computational tools.
The growth of R's usage in hydrology is reflected in the number of newly published hydrological packages, the strengthening of online user communities, and the popularity of training courses and events.
In this paper, we explore the benefits and advantages of R's usage in hydrology, such as the democratization of data science and numerical literacy, the enhancement of reproducible research and open science, the access to statistical tools, the ease of connecting R to and from other languages, and the support provided by a growing community.
This paper provides an overview of a typical hydrological workflow based on reproducible principles and packages for retrieval of hydro-meteorological data, spatial analysis, hydrological modelling, statistics, and the design of static and dynamic visualizations and documents.
We discuss some of the challenges that arise when using R in hydrology and useful tools to overcome them, including the use of hydrological libraries, documentation, and vignettes (long-form guides that illustrate how to use packages); the role of integrated development environments (IDEs); and the challenges of big data and parallel computing in hydrology.
Lastly, this paper provides a roadmap for R's future within hydrology, with R packages as a driver of progress in the hydrological sciences, application programming interfaces (APIs) providing new avenues for data acquisition and provision, enhanced teaching of hydrology in R, and the continued growth of the community via short courses and events.
The link between pollution and health is commonly explored by trying to identify the dominant cause of pollution and its most significant effect on health outcomes. The use of multivariate features to describe exposure is less explored because investigating a large domain of scenarios is theoretically (i.e., interpretation of results) and technically (i.e., computational effort) challenging. In this work we explore the use of Bayesian Networks with a multivariate approach to identify the probabilistic dependence structure of the environment‐health nexus. This consists of environmental factors (topography and climate), exposure levels (concentration of outdoor air pollutants), and health outcomes (mortality rates). The information is collated with regard to a data‐rich study area: the English regions (UK), which incorporate environmental types that range in character from urban to rural. We implemented a reproducible workflow in the R programming language to collate environment‐health data and analyze almost 50 million observations, making use of a graphical model (Bayesian Network) and Big Data technologies. Results show that for pollution and weather variables the model tests well in sample and also has good predictive power when tested out of sample. This is facilitated by a training/testing split in the data along the time and space dimensions, and suggests that the model generalizes well to new regions and time periods.
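A training/testing split along both time and space, as described in this abstract, could look roughly like the sketch below. It is written in Python for illustration even though the authors worked in R. The record fields (`region`, `year`), the held-out region and the cut-off year are all hypothetical and are not taken from the paper.

```python
def split_by_time_and_space(records, test_regions, test_from_year):
    """Hold out whole regions and all years from `test_from_year` onwards.

    A record goes to the test set if it belongs to a held-out region OR to a
    held-out period, so the training set never sees either.
    """
    train, test = [], []
    for rec in records:
        if rec["region"] in test_regions or rec["year"] >= test_from_year:
            test.append(rec)
        else:
            train.append(rec)
    return train, test


# Hypothetical toy records; a real data set would also carry pollutant,
# weather and mortality variables.
records = [
    {"region": "London", "year": 2010},
    {"region": "London", "year": 2015},
    {"region": "North East", "year": 2010},
    {"region": "North East", "year": 2015},
]

train, test = split_by_time_and_space(
    records, test_regions={"North East"}, test_from_year=2014
)
```

Holding out entire regions and entire periods, rather than random rows, is what allows out-of-sample performance to be read as generalization to new regions and time periods.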