Crowdsourcing is increasingly being used as a means to tackle problems requiring human intelligence. With the evergrowing worker base that aims to complete microtasks on crowdsourcing platforms in exchange for financial gains, there is a need for stringent mechanisms to prevent exploitation of deployed tasks. Quality control mechanisms need to accommodate a diverse pool of workers, exhibiting a wide range of behavior. A pivotal step towards fraud-proof task design is understanding the behavioral patterns of microtask workers. In this paper, we analyze the prevalent malicious activity on crowdsourcing platforms and study the behavior exhibited by trustworthy and untrustworthy workers, particularly on crowdsourced surveys. Based on our analysis of the typical malicious activity, we define and identify different types of workers in the crowd, propose a method to measure malicious activity, and finally present guidelines for the efficient design of crowdsourced surveys.
Current global environmental issues raise unavoidable challenges for our use of natural resources. Supplying the human population with clean water is becoming a global problem. Numerous organic and inorganic impurities in municipal, industrial, and agricultural waters, ranging from microplastics to high nutrient loads and heavy metals, endanger our nutrition and health. The development of efficient wastewater treatment technologies and circular economic approaches is thus becoming increasingly important. The biomass production of microalgae using industrial wastewater offers the possibility of recycling industrial residues to create new sources of raw materials for energy and material use. This review discusses algae‐based wastewater treatment technologies with a special focus on industrial wastewater sources, the potential of non‐conventional extremophilic (thermophilic, acidophilic, and psychrophilic) microalgae, and industrial algae‐wastewater treatment concepts that have already been put into practice.
PurposeResearch in the area of technology‐enhanced learning (TEL) throughout the last decade has largely focused on sharing and reusing educational resources and data. This effort has led to a fragmented landscape of competing metadata schemas, or interface mechanisms. More recently, semantic technologies were taken into account to improve interoperability. The linked data approach has emerged as the de facto standard for sharing data on the web. To this end, it is obvious that the application of linked data principles offers a large potential to solve interoperability issues in the field of TEL. This paper aims to address this issue.Design/methodology/approachIn this paper, approaches are surveyed that are aimed towards a vision of linked education, i.e. education which exploits educational web data. It particularly considers the exploitation of the wealth of already existing TEL data on the web by allowing its exposure as linked data and by taking into account automated enrichment and interlinking techniques to provide rich and well‐interlinked data for the educational domain.FindingsSo far web‐scale integration of educational resources is not facilitated, mainly due to the lack of take‐up of shared principles, datasets and schemas. However, linked data principles increasingly are recognized by the TEL community. The paper provides a structured assessment and classification of existing challenges and approaches, serving as potential guideline for researchers and practitioners in the field.Originality/valueBeing one of the first comprehensive surveys on the topic of linked data for education, the paper has the potential to become a widely recognized reference publication in the area.
Web search is frequently used by people to acquire new knowledge and to satisfy learning-related objectives. In this context, informational search missions with an intention to obtain knowledge pertaining to a topic are prominent. The importance of learning as an outcome of web search has been recognized. Yet, there is a lack of understanding of the impact of web search on a user's knowledge state. Predicting the knowledge gain of users can be an important step forward if web search engines that are currently optimized for relevance can be molded to serve learning outcomes. In this paper, we introduce a supervised model to predict a user's knowledge state and knowledge gain from features captured during the search sessions. To measure and predict the knowledge gain of users in informational search sessions, we recruited 468 distinct users using crowdsourcing and orchestrated real-world search sessions spanning 11 different topics and information needs. By using scientifically formulated knowledge tests, we calibrated the knowledge of users before and after their search sessions, quantifying their knowledge gain. Our supervised models utilise and derive a comprehensive set of features from the current state of the art and compare performance of a range of feature sets and feature selection strategies. Through our results, we demonstrate the ability to predict and classify the knowledge state and gain using features obtained during search sessions, exhibiting superior performance to an existing baseline in the knowledge state prediction task.
Nowadays, a substantial number of people are turning to crowdsourcing, in order to resolve tasks that require human intervention. Despite a considerable amount of research done in the field of crowdsourcing, existing works fall short when it comes to classifying typically crowdsourced tasks. Understanding the dynamics of the tasks that are crowdsourced and the behaviour of workers, plays a vital role in efficient task-design. In this paper, we propose a two-level categorization scheme for tasks, based on an extensive study of 1000 workers on CrowdFlower. In addition, we present insights into certain aspects of crowd behaviour; the task affinity of workers, effort exerted by workers to complete tasks of various types, and their satisfaction with the monetary incentives.
The Web of Data, and in particular Linked Data, has seen tremendous growth over the past years. However, reuse and take-up of these rich data sources is often limited and focused on a few well-known and established RDF datasets. This can be partially attributed to the lack of reliable and up-to-date information about the characteristics of available datasets. While RDF datasets vary heavily with respect to the features related to quality, coverage, dynamics and currency, reliable information about such features is essential to enable dataset discovery in tasks such as entity linking, distributed query, search or question answering. Even though there exists a wealth of works contributing to the problem of dataset profiling in general, these works are spread across a wide range of communities. In this survey, we provide a first comprehensive survey of the RDF dataset profile features, methods, tools and vocabularies. We organize these building blocks of dataset profiling in a taxonomy and emphasize the links between the dataset profiling and feature extraction approaches and several application domains. The survey is aimed towards data practitioners, data providers and scientists, spanning a large range of communities and drawing from different fields such as dataset profiling, assessment, summarization and characterization. Ultimately, this work is intended to facilitate the reader to identify and locate the relevant features for building a dataset profile for intended applications together with the tools capable of extracting these features from the data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.