SummaryCrowdsourcing methods facilitate the production of scientific information by non‐experts. This form of citizen science (CS) is becoming a key source of complementary data in many fields to inform data‐driven decisions and study challenging problems. However, concerns about the validity of these data often constrain their utility. In this paper, we focus on the use of citizen science data in addressing complex challenges in environmental conservation. We consider this issue from three perspectives. First, we present a literature scan of papers that have employed Bayesian models with citizen science in ecology. Second, we compare several popular majority vote algorithms and introduce a Bayesian item response model that estimates and accounts for participants' abilities after adjusting for the difficulty of the images they have classified. The model also enables participants to be clustered into groups based on ability. Third, we apply the model in a case study involving the classification of corals from underwater images from the Great Barrier Reef, Australia. We show that the model achieved superior results in general and, for difficult tasks, a weighted consensus method that uses only groups of experts and experienced participants produced better performance measures. Moreover, we found that participants learn as they have more classification opportunities, which substantially increases their abilities over time. Overall, the paper demonstrates the feasibility of CS for answering complex and challenging ecological questions when these data are appropriately analysed. This serves as motivation for future work to increase the efficacy and trustworthiness of this emerging source of data.
Building on a strong foundation of philosophy, theory, methods and computation over the past three decades, Bayesian approaches are now an integral part of the toolkit for most statisticians and data scientists. Whether they are dedicated Bayesians or opportunistic users, applied professionals can now reap many of the benefits afforded by the Bayesian paradigm. In this paper, we touch on six modern opportunities and challenges in applied Bayesian statistics: intelligent data collection, new data sources, federated analysis, inference for implicit models, model transfer and purposeful software products. This article is part of the theme issue ‘Bayesian inference: challenges, perspectives, and prospects’.
Background The health and development of children during their first year of full time school is known to impact their social, emotional, and academic capabilities throughout and beyond early education. Physical health, motor development, social and emotional well-being, learning styles, language and communication, cognitive skills, and general knowledge are all considered to be important aspects of a child’s health and development. It is important for many organisations and governmental agencies to continually improve their understanding of the factors which determine or influence development vulnerabilities among children. This article studies the relationships between development vulnerabilities and educational factors among children in Queensland, Australia. Methods Spatial statistical machine learning models are reviewed and compared in the context of a study of geographic variation in the association between development vulnerabilities and attendance at preschool among children in Queensland, Australia. A new spatial random forest (SRF) model is suggested that can explain more of the spatial variation in data than other approaches. Results In the case study, spatial models were shown to provide a better fit compared to models that ignored the spatial variation in the data. The SRF model was shown to be the only model which can explain all of the spatial variation in each of the development vulnerabilities considered in the case study. The spatial analysis revealed that the attendance at preschool factor has a strong influence on the physical health domain vulnerability and emotional maturity vulnerability among children in their first year of school. Conclusion This study confirmed that it is important to take into account the spatial nature of data when fitting statistical machine learning models. A new spatial random forest model was introduced and was shown to explain more of the spatial variation and provide a better model fit in the case study of development vulnerabilities among children in Queensland. At small-area population level, increased attendance at preschool was strongly associated with reduced physical and emotional development vulnerabilities among children in their first year of school.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.