We developed three models of daily human- and lightning-caused fire occurrence to support fire management preparedness and detection planning in the province of British Columbia, Canada, using a lasso-logistic framework. Novel aspects of our work include (1) an ensemble of models fitted to 500 datasets balanced (through response-selective sampling) to have equal numbers of fire and non-fire observations, and (2) a new ranking algorithm that addresses the difficulty of interpreting variable importance in models with a large number of covariates. We also introduce the use of cause-specific average spatial daily fire occurrence, termed baseline risk, as a proxy covariate for missing or poorly estimated factors that influence human and lightning fire occurrence. All three models have strong predictive ability, with areas under the receiver operating characteristic curve exceeding 0.9.
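The balancing step described above can be illustrated with a minimal sketch. It assumes that every fire observation is kept and that an equal number of non-fire observations is drawn at random for each dataset; the abstract does not state the exact sampling scheme, and the function names and toy labels below are hypothetical.

```python
import random

def balanced_subsample(labels, rng):
    """Return row indices with equal numbers of fire (1) and non-fire (0) rows."""
    fire = [i for i, y in enumerate(labels) if y == 1]
    non_fire = [i for i, y in enumerate(labels) if y == 0]
    # Assumption: keep every rare fire row and downsample the common class.
    return fire + rng.sample(non_fire, len(fire))

def make_ensemble_indices(labels, n_datasets=500, seed=0):
    """Build the index sets for an ensemble of balanced datasets."""
    rng = random.Random(seed)
    return [balanced_subsample(labels, rng) for _ in range(n_datasets)]

# Hypothetical toy data: 5 fire observations among 1000.
labels = [1] * 5 + [0] * 995
datasets = make_ensemble_indices(labels)
```

Each index set would then be used to fit one lasso-logistic model, and the fitted models would be combined into the ensemble.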
Background The tonsil of the soft palate in pigs is the colonization site of both commensal and pathogenic microbial agents. Streptococcus suis infections are a significant economic problem in the swine industry. The development of S. suis disease remains poorly understood. The purpose of this study was to identify whether the tonsillar microbiota profile in nursery pigs is altered with S. suis disease. Here, the dynamics of the tonsillar microbiota from 20 healthy pigs and 43 diseased pigs with S. suis clinical signs were characterized.
Results Based on the presence or absence of S. suis in systemic sites, diseased pigs were classified into confirmed (n = 20) or probable (n = 23) groups, respectively. Microbiota composition was assessed using the V3-V4 hypervariable region of the 16S rRNA gene, and results were analyzed to identify the diversity of the tonsillar microbiota. The taxonomic composition of the tonsil microbiota proved to be highly diverse between individuals, and the results showed statistically significant differences in microbial community structure among the diagnosis groups. The confirmed group had the lowest observed species richness, while the probable group had higher phylogenetic diversity than the healthy group. Unweighted UniFrac also demonstrated that the probable group had a higher beta diversity than both the healthy and the confirmed groups. A Dirichlet-multinomial mixture (DMM) model-based clustering method partitioned the tonsil microbiota into two distinct community types that did not correspond with disease status. However, there was an association between Streptococcus suis serotype 2 and DMM community type 1 (p = 0.03). ANCOM-BC identified 24 Streptococcus amplicon sequence variants (ASVs) that were differentially abundant between the DMM community types.
Conclusions This study provides a comprehensive analysis of the structure and membership of the tonsil microbiota in nursery pigs and uncovers differences and similarities across S. suis disease statuses. While the overall abundance of Streptococcus did not differ among the diagnosis groups, the unique profile of DMM community type 1 and its observed association with S. suis serotype 2 could provide insight into potential tonsillar microbiota involvement in S. suis disease.
Climate change will create warmer temperatures, greater precipitation, and longer growing seasons in northern latitudes, making agriculture increasingly possible in boreal regions. To assess the potential of any such expansion, this paper provides a first-order approximation of how much land could become suitable for four staple crops (corn, potato, soy, and wheat) in Canada by 2080. In addition, we estimate how the environmental trade-offs of northern agricultural expansion will impact critical ecosystem services. Primarily, we evaluate how the regulatory ecosystem services of carbon storage and sequestration and the habitat services supporting biodiversity would be traded for the provisioning services of food production. Here we show that under climate change projected by the Canadian Earth System Model (CanESM2) for Representative Concentration Pathway 4.5, ∼1.85 million km² of land may become suitable for farming in Canada’s North. Farming this land would release ∼15 gigatonnes of carbon if all forests and wetlands were cleared and plowed. These land-use changes would also have profound implications for Indigenous sovereignty and the governance of protected and conserved areas in Canada. These results highlight that research is urgently needed so that stakeholders can become aware of the scope of potential economic opportunities, cultural issues, and environmental trade-offs required for agricultural sustainability in Canada.
Land suitability models for Canada are currently based on single-crop inventories and expert opinion. We present a data-driven multi-layer perceptron that simultaneously predicts the land suitability of several crops in Canada, including barley, peas, spring wheat, canola, oats, and soy. Available crop yields from 2013–2020 are downscaled to the farm level by masking the district-level crop yield data to focus only on areas where crops are cultivated and by leveraging soil-climate-landscape variables obtained from Google Earth Engine for crop yield prediction. This new semi-supervised learning approach can accommodate data from different spatial resolutions and enables training with unlabelled data. The incorporation of a crop indicator function further allows for the training of a multi-crop model that can capture the interdependences and correlations between various crops, thereby leading to more accurate predictions. Through k-fold cross-validation, we show that, compared to the single-crop models, our multi-crop model could produce up to a 2.82-fold reduction in mean absolute error for any particular crop. We found that barley, oats, and mixed grains were more tolerant to soil-climate-landscape variations and could be grown in many regions of Canada, while non-grain crops were more sensitive to environmental factors. Predicted crop suitability was associated with a region’s growing season length, which supports climate change projections that regions of northern Canada will become more suitable for agricultural use. The proposed multi-crop model could facilitate assessment of the suitability of northern lands for crop cultivation and be incorporated into cost-benefit analyses.
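One simple way to picture a crop indicator function is as a one-hot vector appended to the environmental features, so that a single network can be trained on all crops at once. The sketch below assumes that encoding; the abstract does not give the exact form of the indicator, and the feature values and function names are hypothetical.

```python
# Crops named in the abstract; the ordering here is an arbitrary assumption.
CROPS = ["barley", "peas", "spring wheat", "canola", "oats", "soy"]

def crop_indicator(crop):
    """One-hot encoding of the crop (assumed form of the indicator)."""
    if crop not in CROPS:
        raise ValueError(f"unknown crop: {crop}")
    return [1.0 if crop == c else 0.0 for c in CROPS]

def model_input(env_features, crop):
    """Concatenate soil-climate-landscape features with the crop indicator."""
    return list(env_features) + crop_indicator(crop)

# Hypothetical features for one location, queried for oats.
x = model_input([0.5, 120.0], "oats")
```

Because the environmental features are shared across crops, the same location can be scored for every crop by changing only the indicator part of the input.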
We develop a novel covariate ranking and selection algorithm for regularized ordinary logistic regression (OLR) models in the presence of severe class imbalance in high-dimensional datasets with correlated signal and noise covariates. Class imbalance is resolved using response-based subsampling, which we also employ to achieve stability in variable selection by creating an ensemble of regularized OLR models fitted to subsampled (and balanced) datasets. The regularization methods considered in our study include Lasso, adaptive Lasso (adaLasso) and ridge regression. Our methodology is versatile in the sense that it works effectively for regularization techniques involving both hard (e.g. Lasso) and soft (e.g. ridge) shrinkage of the regression coefficients. We assess selection performance by conducting a detailed simulation experiment involving varying moderate-to-severe class-imbalance ratios and highly correlated continuous and discrete signal and noise covariates. Simulation results show that our algorithm is robust against severe class imbalance in the presence of highly correlated covariates, and consistently achieves stable and accurate variable selection with a very low false discovery rate. We illustrate our methodology using a case study involving a severely imbalanced high-dimensional wildland fire occurrence dataset comprising 13 million instances. The case study and simulation results demonstrate that our framework provides a robust approach to variable selection in severely imbalanced big binary data.
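The abstract does not spell out the ranking algorithm itself. As a generic illustration of how an ensemble can stabilise variable selection, the sketch below ranks covariates by how often they receive a non-zero coefficient across the ensemble members. This is a common selection-frequency heuristic and is not claimed to be the authors' method; the covariate names and coefficients are hypothetical.

```python
def rank_by_selection_frequency(coef_sets, names, tol=1e-8):
    """Rank covariates by the share of ensemble members that selected them."""
    n_models = len(coef_sets)
    freq = {
        name: sum(abs(coefs[j]) > tol for coefs in coef_sets) / n_models
        for j, name in enumerate(names)
    }
    # Highest frequency first; ties broken alphabetically for determinism.
    return sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical coefficients from four Lasso fits on balanced subsamples.
names = ["temp", "wind", "noise"]
coef_sets = [
    [0.8, 0.0, 0.0],
    [0.7, 0.2, 0.0],
    [0.9, 0.0, 0.0],
    [0.6, 0.3, 0.0],
]
ranking = rank_by_selection_frequency(coef_sets, names)
```

A frequency count of this kind only makes sense for hard-shrinkage methods such as Lasso, where coefficients can be exactly zero. Ridge coefficients are rarely zero, so ranking them would need a different summary, for example the magnitude of standardized coefficients.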