Abstract. Within the framework of the PAGES NAm2k project, 510 North American borehole temperature-depth profiles were analyzed to infer recent climate changes. To facilitate comparisons and to study the same time period, the profiles were truncated at 300 m. Ground surface temperature histories for the last 500 years were obtained from a model describing temperature changes at the surface for several climate-differentiated regions in North America. The model is evaluated by inverting the temperature perturbations using singular value decomposition, and its solutions are assessed with a Monte Carlo approach. Within the 95 % confidence interval, the results suggest a warming of between 1.0 and 2.5 K during the last two centuries. A regional analysis, consisting of mean temperature changes over the last 500 years and geographical maps of ground surface temperatures, shows that all regions experienced warming, but this warming is not spatially uniform and is more marked in northern regions.
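As a rough illustration of the inversion scheme described in this abstract, the sketch below applies a truncated singular value decomposition to a linear forward model and bounds the solution with Monte Carlo resampling of noise-perturbed data. It is a minimal sketch under stated assumptions, not the authors' code: the forward matrix G, the noise level, and the truncation level k are placeholders.

```python
import numpy as np

def invert_gst(G, d, k=4, n_mc=1000, noise_std=0.05, seed=0):
    """Truncated-SVD inversion of borehole temperature anomalies.

    G         : (n_depths, n_steps) forward model mapping a step-wise
                ground surface temperature (GST) history to subsurface
                temperature perturbations (assumed linear)
    d         : (n_depths,) observed anomalies vs. a steady-state profile
    k         : number of singular values retained (regularization)
    n_mc      : Monte Carlo realizations for uncertainty assessment
    noise_std : assumed data noise (K)
    """
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    # Truncated pseudo-inverse: keep only the k largest singular values.
    G_inv = Vt[:k].T @ np.diag(1.0 / s[:k]) @ U[:, :k].T
    m_best = G_inv @ d

    # Monte Carlo: re-invert noise-perturbed data to bound the solution.
    rng = np.random.default_rng(seed)
    ensemble = np.array([G_inv @ (d + rng.normal(0, noise_std, d.size))
                         for _ in range(n_mc)])
    lo, hi = np.percentile(ensemble, [2.5, 97.5], axis=0)
    return m_best, lo, hi  # best estimate and 95 % bounds per time step
```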
Abstract. Differences between paleoclimatic reconstructions are caused by two factors: the method and the input data. While many studies compare methods, this study focuses on the consequences of the choice of input data in a state-of-the-art Kalman-filter paleoclimate data assimilation approach. We evaluate reconstruction quality in the 20th century based on three collections of tree-ring records: (1) 54 of the best temperature-sensitive tree-ring chronologies chosen by experts; (2) 415 temperature-sensitive tree-ring records chosen less strictly by regional working groups and statistical screening; (3) 2287 tree-ring series that are not screened for climate sensitivity. The three data sets span the range from small sample size, small spatial coverage, and strict screening for temperature sensitivity to large sample size and spatial coverage but no screening. Additionally, we explore combinations of these data sets with screening methods to improve reconstruction quality. A large, unscreened collection generally leads to poor reconstruction skill. A small expert selection of extratropical Northern Hemisphere records allows for a skillful high-latitude temperature reconstruction but cannot be expected to provide information for other regions and other variables. We achieve the best reconstruction skill across all variables and regions by combining all available input data while rejecting records with insignificant climatic information (p value of the regression model > 0.05) and removing duplicate records. It is important to use a tree-ring proxy system model that includes both major growth limitations, temperature and moisture.
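The screening criterion described here (rejecting records whose regression against the target climate variable has p > 0.05, then removing duplicates) might look roughly like the sketch below. The data layout, thresholds, and function names are illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np
from scipy import stats

def screen_records(records, target, alpha=0.05):
    """Keep tree-ring series with a significant linear relationship
    to the target series (e.g., local instrumental temperature).

    records : dict mapping record id -> np.ndarray aligned with target
    target  : np.ndarray of the overlapping instrumental series
    """
    kept = {}
    for rid, series in records.items():
        slope, intercept, r, p, se = stats.linregress(series, target)
        if p <= alpha:  # reject records with insignificant climate signal
            kept[rid] = series
    return kept

def drop_duplicates(records, r_thresh=0.99):
    """Remove near-identical series (likely duplicates) by pairwise
    correlation; keeps the first record of each duplicate pair."""
    ids = list(records)
    dropped = set()
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if b in dropped:
                continue
            if abs(np.corrcoef(records[a], records[b])[0, 1]) > r_thresh:
                dropped.add(b)
    return {rid: s for rid, s in records.items() if rid not in dropped}
```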
Abstract. Estimates of climate sensitivity from general circulation model (GCM) simulations still present a large spread despite continued improvements in climate modeling since the 1970s. This variability is partially caused by the dependence of several long-term feedback mechanisms on the reference climate state. Indeed, state-of-the-art GCMs present a large spread of control climate states, probably due to the lack of a suitable reference for constraining the climatology of preindustrial simulations. We assemble a new gridded database of long-term ground surface temperatures (the LoST database) obtained from geothermal data over North America, and we explore its use as a potential reference for the evaluation of GCM preindustrial simulations. We compare the LoST database with observations from the Climatic Research Unit (CRU) database, as well as with five past millennium transient climate simulations and five preindustrial control simulations from the third phase of the Paleoclimate Modelling Intercomparison Project (PMIP3) and the fifth phase of the Coupled Model Intercomparison Project (CMIP5). The database is consistent with meteorological observations as well as with both types of preindustrial simulations, which suggests that LoST temperatures can be employed as a reference to narrow down the spread of surface temperature climatologies in GCM preindustrial control and past millennium simulations.
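One simple way to compare a model's preindustrial surface temperature climatology against a gridded reference such as LoST is an area-weighted mean bias. The sketch below shows the idea under assumed array shapes; it is not the paper's evaluation code.

```python
import numpy as np

def area_weighted_bias(model_clim, ref_clim, lats):
    """Area-weighted mean bias of a model surface temperature
    climatology against a gridded reference (e.g., a LoST-like grid).

    model_clim, ref_clim : (n_lat, n_lon) long-term mean fields (K)
    lats                 : (n_lat,) latitudes in degrees
    """
    diff = model_clim - ref_clim
    # Weight each grid cell by the cosine of its latitude.
    w = np.cos(np.deg2rad(lats))[:, None] * np.ones_like(diff)
    mask = ~np.isnan(diff)  # the reference grid may have spatial gaps
    return np.sum(diff[mask] * w[mask]) / np.sum(w[mask])
```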
A recent trend in health-related machine learning proposes the use of Graph Neural Networks (GNNs) to model biomedical data, a choice justified by the complexity of healthcare data and the modelling power of graph abstractions. GNNs thus emerge as a natural choice for learning from growing amounts of healthcare data. When formulating the problem, however, there are multiple design choices and decisions that can affect the final performance. In this work, we focus on Clinical Trial (CT) protocols, which are hierarchical documents containing free text as well as medical codes and terms, and design a classifier to predict each CT protocol's termination risk as "low" or "high". We show that while GNNs are very successful at this classification task, the way the graph is constructed also matters, and one can benefit from making useful a priori information more explicit. A natural choice is to treat each CT protocol as an independent graph and pose the problem as graph classification; however, consistent performance improvements can be achieved by treating protocols as super-nodes in one unified graph, connecting them according to metadata such as a shared medical condition or intervention, and approaching the problem as node classification rather than graph classification. We validate this hypothesis experimentally on a large-scale, manually labeled CT database. This provides useful insights into the flexibility of graph-based modeling for machine learning in the healthcare domain.
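A minimal sketch of the super-node formulation might look as follows, using PyTorch Geometric: protocols become nodes in one unified graph, edges link protocols that share metadata (here, a condition id), and a two-layer GCN performs node classification. The embeddings, metadata encoding, and hyperparameters are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

def build_protocol_graph(embeddings, conditions):
    """Build one unified graph of CT-protocol super-nodes.

    embeddings : (n_protocols, dim) float tensor of pooled document features
    conditions : list of condition ids, one per protocol (assumed metadata)
    """
    src, dst = [], []
    for i in range(len(conditions)):
        for j in range(i + 1, len(conditions)):
            if conditions[i] == conditions[j]:  # shared metadata -> edge
                src += [i, j]
                dst += [j, i]
    edge_index = torch.tensor([src, dst], dtype=torch.long)
    return Data(x=embeddings, edge_index=edge_index)

class RiskGCN(torch.nn.Module):
    """Two-layer GCN classifying each super-node as low/high risk."""
    def __init__(self, dim, hidden=64, n_classes=2):
        super().__init__()
        self.conv1 = GCNConv(dim, hidden)
        self.conv2 = GCNConv(hidden, n_classes)

    def forward(self, data):
        h = F.relu(self.conv1(data.x, data.edge_index))
        return self.conv2(h, data.edge_index)  # per-node class logits
```

Framing the task as node classification lets message passing share evidence between related protocols, which is exactly the a priori information the unified graph makes explicit.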
The prediction of chemical reaction pathways has been accelerated by the development of novel machine learning architectures based on the deep learning paradigm. In this context, deep neural networks initially designed for language translation have been used to accurately predict a wide range of chemical reactions. Among models suited for the task of language translation, the recently introduced molecular transformer reached impressive performance in terms of forward-synthesis and retrosynthesis predictions. In this study, we first present an analysis of the performance of transformer models for product, reactant, and reagent prediction tasks under different scenarios of data availability and data augmentation. We find that the impact of data augmentation depends on the prediction task and on the metric used to evaluate the model performance. Second, we probe the contribution of different combinations of input formats, tokenization schemes, and embedding strategies to model performance. We find that less stable input settings generally lead to better performance. Lastly, we validate the superiority of round-trip accuracy over simpler evaluation metrics, such as top-k accuracy, using a committee of human experts and show a strong agreement for predictions that pass the round-trip test. This demonstrates the usefulness of more elaborate metrics in complex predictive scenarios and highlights the limitations of direct comparisons to a predefined database, which may include a limited number of chemical reaction pathways.
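Round-trip accuracy, as described above, checks whether the forward model recovers the original product from the predicted reactants. Below is a minimal sketch, assuming placeholder retro_model and forward_model callables (single-prediction interfaces invented for illustration) and RDKit for SMILES canonicalization.

```python
from rdkit import Chem

def canonical(smiles):
    """Canonicalize a SMILES string; returns None if it does not parse."""
    mol = Chem.MolFromSmiles(smiles)
    return Chem.MolToSmiles(mol) if mol else None

def round_trip_accuracy(products, retro_model, forward_model):
    """Fraction of products recovered after retro + forward prediction.

    retro_model(product_smiles)    -> predicted reactant SMILES (assumed)
    forward_model(reactant_smiles) -> predicted product SMILES (assumed)
    """
    hits = 0
    for p in products:
        reactants = retro_model(p)          # retrosynthesis step
        recovered = forward_model(reactants)  # forward-synthesis step
        if canonical(recovered) == canonical(p):
            hits += 1
    return hits / len(products)
```

Unlike top-k accuracy, this metric does not require the predicted reactants to match a reference database entry, so chemically valid alternative routes are not penalized.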
Abstract. Although collaborative efforts have been made to retrieve climate data from instrumental observations and paleoclimate records, a large amount of valuable information in historical archives has not yet been utilized for climate reconstruction. Owing to the qualitative nature of these datasets, historical texts have been compiled and studied by historians aiming to describe the impact of climate on socio-economic aspects of human societies, but the inclusion of this information in past climate reconstructions remains largely unexplored. Within this context, we present a novel approach to assimilating the climate information contained in chronicles and annals from the 15th century to generate robust temperature and precipitation reconstructions of the Burgundian Low Countries, taking into account the uncertainties associated with the descriptions of narrative sources. After data assimilation, our reconstructions present a high seasonal temperature correlation of ∼0.8, independent of the climate model employed to estimate the background state of the atmosphere. Our study aims to be a first step towards a more quantitative use of the information available in historical texts, showing how Bayesian inference can help the climate community with this endeavour.
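At its core, the Bayesian assimilation of a documentary observation can be illustrated with a scalar Kalman update, where the qualitative nature of the source is expressed as a large observation variance. This is a conceptual sketch only; the mapping of a chronicle entry to a temperature anomaly below is an invented example, not the study's actual observation operator.

```python
def kalman_update(x_b, var_b, y, var_y):
    """Scalar Kalman/Bayesian update combining a model prior (background)
    with a noisy observation derived from a documentary index.

    x_b, var_b : background mean and variance (from a climate model)
    y, var_y   : observation and its (large) variance reflecting the
                 qualitative nature of the narrative source
    """
    k = var_b / (var_b + var_y)   # Kalman gain
    x_a = x_b + k * (y - x_b)     # analysis mean
    var_a = (1.0 - k) * var_b     # analysis variance
    return x_a, var_a

# Example (all values assumed): a chronicle reporting an "exceptionally
# cold winter" mapped to a -2 K anomaly with generous uncertainty.
x_a, var_a = kalman_update(x_b=0.0, var_b=1.0, y=-2.0, var_y=2.0)
```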