Nowadays, in most of the modern database applications, lots of critical queries and tasks cannot be completely addressed by machine. Crowd-sourcing database has become a new paradigm for harness human cognitive abilities to process these computer hard tasks. In particular, those problems that are difficult for machines but easier for humans can be solved better than ever, such as entity resolution, fuzzy matching for predicates and joins, and image recognition. Additionally, crowd-sourcing database allows performing database operators on incomplete data as human workers can be involved to provide estimated values during run-time. Skyline queries which received formidable attention by database community in the last decade, and exploited in a variety of applications such as multi-criteria decision making and decision support systems. Various works have been accomplished address the issues of skyline query in crowd-sourcing database. This includes a database with full and partial complete data. However, we argue that processing skyline queries with partial incomplete data in crowd-sourcing database has not received an appropriate attention. Therefore, an efficient approach processing skyline queries with partial incomplete data in crowd-sourcing database is needed. This paper attempts to present an efficient model tackling the issue of processing skyline queries in incomplete crowd-sourcing database. The main idea of the proposed model is exploiting the available data in the database to estimate the missing values. Besides, the model tries to explore the crowd-sourced database in order to provide more accurate results, when local database failed to provide precise values. In order to ensure high quality result could be obtained, certain factors should be considered for worker selection to carry out the task such as workers quality and the monetary cost. Other critical factors should be considered such as time latency to generate the results.
Data incompleteness becomes a frequent phenomenon in a large number of contemporary database applications such as web autonomous databases, big data, and crowd-sourced databases. Processing skyline queries over incomplete databases impose a number of challenges that negatively influence processing the skyline queries. Most importantly, the skylines derived from incomplete databases are also incomplete in which some values are missing. Retrieving skylines with missing values is undesirable, particularly, for recommendation and decision-making systems. Furthermore, running skyline queries on a database with incomplete data raises a number of issues influence processing skyline queries such as losing the transitivity property of the skyline technique and cyclic dominance between the tuples. The issue of estimating the missing values of skylines has been discussed and examined in the database literature. Most recently, several studies have suggested exploiting the crowd-sourced databases in order to estimate the missing values by generating plausible values using the crowd. Crowd-sourced databases have proved to be a powerful solution to perform user-given tasks by integrating human intelligence and experience to process the tasks. However, task processing using crowd-sourced incurs additional monetary cost and increases the time latency. Also, it is not always possible to produce a satisfactory result that meets the user's preferences. This paper proposes an approach for estimating the missing values of the skylines by first exploiting the available data and utilizes the implicit relationships between the attributes in order to impute the missing values of the skylines. This process aims at reducing the number of values to be estimated using the crowd when local estimation is inappropriate. Intensive experiments on both synthetic and real datasets have been accomplished. The experimental results have proven that the proposed approach for estimating the missing values of the skylines over crowd-sourced enabled incomplete databases is scalable and outperforms the other existing approaches.
Crowd-sourcing is a powerful solution to correctly answer expensive and unanswered queries in the database. This includes queries on a database with uncertain and incomplete data. The crowd-sourcing attempts to exploit human abilities to process these difficult tasks and workers helped to provide accurate results utilizing the available data in the crowd. The crowd-sourcing database systems (CSDB) combined the ability of the crowd with the relational database by using some variant of the relational database with minor changes. This paper surveys and examines the leading studies conducted on the area of query processing for both traditional and preference queries in crowd-sourcing databases. The focus is given on highlights the strengths and the weakness of each approach. A detailed discussion for the current and future trends research relevant to query processing in the area of crow-sourced databases is also demonstrated.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.