Over the last decade, hundreds of thousands of volunteers have contributed to science by collecting or analyzing data. This public participation in science, also known as citizen science, has contributed to significant discoveries and led to publications in major scientific journals. However, little attention has been paid to data quality issues. In this work we argue that determining the accuracy of data obtained by crowdsourcing is a fundamental problem, and we point out that, for many real-life scenarios, mathematical tools and processes for the evaluation of data quality are missing. We propose a probabilistic methodology for evaluating the accuracy of labels obtained by crowdsourcing in citizen science. The methodology builds on an abstract probabilistic graphical model formalism, which is shown to generalize several existing label aggregation models. We show how to make practical use of the methodology through a comparison of data obtained from different citizen science communities analyzing the 2019 Albania earthquake.
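To make the label aggregation setting concrete, the following is a minimal sketch of a Dawid-Skene style one-coin model, one of the classical aggregation approaches that graphical-model formalisms of this kind generalize. The function name, the binary-task restriction, and the EM schedule are illustrative assumptions, not the paper's actual methodology.

```python
# Minimal sketch: EM for a one-coin Dawid-Skene style model on binary
# crowdsourced labels. Illustrative only; not the paper's actual model.
import numpy as np

def aggregate_labels(votes, n_iter=50):
    """votes: (n_items, n_workers) array of 0/1 labels."""
    n_items, _ = votes.shape
    q = votes.mean(axis=1)  # P(true label = 1) per item, init by majority vote
    for _ in range(n_iter):
        # M-step: each worker's accuracy under the current posteriors,
        # with Laplace smoothing to keep it strictly inside (0, 1).
        agree = (votes * q[:, None] + (1 - votes) * (1 - q)[:, None]).sum(axis=0)
        acc = (agree + 1.0) / (n_items + 2.0)
        prior = np.clip(q.mean(), 1e-6, 1 - 1e-6)
        # E-step: posterior of each item's true label given worker accuracies.
        log_p1 = np.log(prior) + (votes * np.log(acc)
                                  + (1 - votes) * np.log(1 - acc)).sum(axis=1)
        log_p0 = np.log(1 - prior) + (votes * np.log(1 - acc)
                                      + (1 - votes) * np.log(acc)).sum(axis=1)
        q = 1.0 / (1.0 + np.exp(log_p0 - log_p1))
    return q, acc  # item posteriors and estimated worker accuracies
```

Running this on an items-by-workers vote matrix yields both a posterior over each item's true label and an accuracy estimate per volunteer, which is the kind of quantity a data-quality evaluation can be built on.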
When AI technologies are applied to real-world problems, it is often difficult for developers to anticipate all the knowledge needed. Previous research has shown that introspective reasoning can be a useful tool for addressing this problem in case-based reasoning (CBR) systems, by enabling them to augment their routine learning of cases with learning to make better use of those cases, as problem-solving experience reveals deficiencies in their reasoning process. In this paper we present a new introspective model for autonomously improving the performance of a CBR system by reasoning about its problem-solving failures. We illustrate the model's benefits with experimental results from tests in an industrial design application.
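As a rough illustration of failure-driven introspective learning (a hypothetical sketch, not the paper's model), consider a weighted nearest-neighbour CBR system that, when a retrieved solution fails, increases the retrieval weights of the features on which the query and the misleading case differed, so that similar failures are less likely to recur.

```python
# Hypothetical sketch of failure-driven introspection in a weighted 1-NN
# CBR system; class and method names are illustrative assumptions.
import numpy as np

class IntrospectiveCBR:
    def __init__(self, n_features):
        self.cases = []                   # list of (feature vector, solution)
        self.w = np.ones(n_features)      # feature weights used in retrieval

    def retain(self, features, solution):
        self.cases.append((np.asarray(features, float), solution))

    def retrieve(self, query):
        query = np.asarray(query, float)
        dists = [np.dot(self.w, (query - cf) ** 2) for cf, _ in self.cases]
        best = int(np.argmin(dists))
        return best, self.cases[best][1]

    def introspect(self, query, case_idx, success, lr=0.1):
        # On failure, up-weight the features where the query and the
        # retrieved case differ, so the misleading case ranks lower next time.
        diff = (np.asarray(query, float) - self.cases[case_idx][0]) ** 2
        if not success:
            self.w += lr * diff
        else:
            self.w = np.maximum(self.w - lr * diff, 1e-3)
```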
Being able to predict the performance of a Case-Based Reasoning (CBR) system on a set of future problems would provide invaluable information for the design and maintenance of the system: the needed design changes and maintenance tasks could then be carried out proactively to improve future performance. This paper proposes a novel method for identifying regions in a case base where the system gives low-confidence solutions to possible future problems. Experiments are provided for the RoboSoccer domain, and we discuss how the identified regions of dubiosity help us analyse the case base and the reasoning mechanisms of the given CBR system.
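A minimal sketch of one way such regions could be detected (an illustrative assumption, not the paper's method): probe the problem space and score each probe by how strongly the solutions of its nearest cases agree; probes with low agreement mark candidate regions of dubiosity.

```python
# Illustrative sketch: flag low-confidence regions of a case base by the
# solution agreement among each probe point's k nearest cases.
import numpy as np

def confidence_map(case_feats, case_solutions, probes, k=5):
    """case_feats: (n_cases, d); case_solutions: (n_cases,); probes: (m, d)."""
    conf = []
    for p in probes:
        nn = np.argsort(np.linalg.norm(case_feats - p, axis=1))[:k]
        _, counts = np.unique(case_solutions[nn], return_counts=True)
        conf.append(counts.max() / k)   # fraction agreeing with the majority
    return np.array(conf)

# Probes whose confidence falls below a threshold (e.g. 0.6) delimit the
# regions where the system is likely to give low-confidence solutions.
```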
In the early stages of an emergency, information extracted from social media can support crisis response with evidence-based content. To capture this evidence, the events of interest must first be detected promptly. An automated detection system can then activate other tasks, such as preemptive data processing for extracting event-related information. In this paper, we extend the human-in-the-loop approach of our previous work, TriggerCit, with a machine-learning-based event detection system trained on word-count time series and coupled with an automated lexicon-building algorithm. We design the framework in a language-agnostic fashion, so the system can be deployed for any language without substantial effort. We evaluate the proposed approach against authoritative flood data for Nepal recorded over two years.
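For concreteness, here is a minimal sketch of one standard way to detect events from word-count time series: flag a time step when the counts of the current lexicon terms spike well above a rolling baseline. The window length and threshold are illustrative assumptions, not TriggerCit's actual configuration, and the sketch omits the lexicon-building step.

```python
# Illustrative sketch: spike detection on an hourly word-count series via a
# rolling z-score. Parameters are assumptions, not TriggerCit's settings.
import numpy as np

def detect_events(counts, window=24, z_thresh=3.0):
    """counts: 1-D array of hourly mentions of the current lexicon terms."""
    events = []
    for t in range(window, len(counts)):
        hist = counts[t - window:t]
        mu, sigma = hist.mean(), hist.std() + 1e-9
        if (counts[t] - mu) / sigma > z_thresh:
            events.append(t)              # candidate event at time step t
    return events
```

Operating on term counts rather than raw text is what makes such a detector language-agnostic: only the lexicon needs to change per language.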
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and indicate whether the citing article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.