Crowdsourcing for Query Processing on Web Data: A Case Study on the Skyline Operator

Maarry, Kinda El; Lofi, Christoph; Balke, Wolf‐Tilo

doi:10.2498/cit.1002509

Cited by 7 publications

(16 citation statements)

References 19 publications

“…The crowd-sourcing strategy is based on incorporating human workers to attain improved results, while the advanced heuristic offers an alternative offline solution for times when crowd-sourcing may not be a feasible option, for example when the missing data are not easily available for the crowd or the costs of crowd-sourcing are prohibitive. We conclude that the approach introduced in [4] has failed to generate an accurate value when the missing rate is very high. Besides, the approach incurred high time latency and monetary cost when estimating the missing values from the crowd.…”

Section: Introduction (mentioning)

confidence: 91%

“…However, the correctness of the skyline might be deteriorated when relying on heuristics rules to identify the skylines. It may be that certain dominated tuples are included in the skyline results (false positive) and/or certain tuples which should be included in the skyline results are omitted (false negative) [4]. Furthermore, it is most likely that the relative error between the actual and the estimated values become very high if heuristics rules cannot capture the semantic relationship between the attributes; also if the missing rate in the database is high as it impacts the quality of the estimated values.…”

Section: B Skyline Queries On Crowd-sourced-enabled Incomplete Datab (mentioning)

confidence: 99%

“…These activities negatively influence the database contents and deteriorate their quality [1], [2], [3]. These factors impact on the completeness and the correctness of the query result [4], [5], [6], [7]. Some queries cannot be optimally answered through traditional database management techniques as the process of answering certain queries relies on information that is incomplete, imprecise, or uncertain.…”

Section: Introduction (mentioning)

confidence: 99%

“…In 2013 the work in [6] proposed a new approach aiming to overcome the limitation of the previous work by proposing a different heuristic approach than KNN. The new approach adopts a model named minimum value model and has been further extended in 2015 [4] by adding the two new strategies of crowdsourcing and advanced heuristics. The crowd-sourcing strategy is based on incorporating human workers to attain improved results, while the advanced heuristic offers an alternative offline solution for times when crowd-sourcing may not be a feasible option, for example when the missing data are not easily available for the crowd or the costs of crowd-sourcing are prohibitive.…”

Section: Introduction (mentioning)

confidence: 99%

Skyline Queries Computation on Crowdsourced-Enabled Incomplete Database

et al. 2020

Data incompleteness has become a frequent phenomenon in a large number of contemporary database applications such as autonomous web databases, big data, and crowd-sourced databases. Processing skyline queries over incomplete databases imposes a number of challenges. Most importantly, the skylines derived from incomplete databases are themselves incomplete, in that some of their values are missing. Retrieving skylines with missing values is undesirable, particularly for recommendation and decision-making systems. Furthermore, running skyline queries on a database with incomplete data raises a number of issues that affect query processing, such as the loss of the transitivity property of the skyline technique and cyclic dominance between tuples. The issue of estimating the missing values of skylines has been discussed and examined in the database literature. Most recently, several studies have suggested exploiting crowd-sourced databases to estimate the missing values by generating plausible values using the crowd. Crowd-sourced databases have proved to be a powerful solution for performing user-given tasks by integrating human intelligence and experience. However, processing tasks through crowd-sourcing incurs additional monetary cost and increases time latency. Also, it is not always possible to produce a satisfactory result that meets the user's preferences. This paper proposes an approach for estimating the missing values of the skylines that first exploits the available data and uses the implicit relationships between the attributes to impute the missing values. This process aims at reducing the number of values that must be estimated using the crowd when local estimation is inappropriate. Extensive experiments on both synthetic and real datasets have been conducted. The experimental results show that the proposed approach for estimating the missing values of the skylines over crowd-sourced-enabled incomplete databases is scalable and outperforms the other existing approaches.
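
The loss of transitivity and the cyclic dominance mentioned in the abstract can be illustrated with a minimal sketch. It assumes a convention that is common in work on incomplete skylines but is not confirmed by this abstract: two tuples are compared only on the dimensions where both have a value, and smaller values are preferred. The function name `dominates_incomplete` and the three sample tuples are hypothetical.

```python
from typing import Optional, Sequence

Row = Sequence[Optional[float]]  # None marks a missing value


def dominates_incomplete(a: Row, b: Row) -> bool:
    """True if a dominates b on the dimensions where both have values.

    Assumption: smaller is better, and dimensions with a missing value
    on either side are ignored in the comparison.
    """
    common = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not common:
        return False  # nothing to compare on
    no_worse = all(x <= y for x, y in common)
    strictly_better = any(x < y for x, y in common)
    return no_worse and strictly_better


# Hypothetical tuples with one missing value each.
a = (1, 2, None)
b = (None, 3, 1)
c = (0, None, 2)

print(dominates_incomplete(a, b))  # compared on dimension 2 only
print(dominates_incomplete(b, c))  # compared on dimension 3 only
print(dominates_incomplete(c, a))  # compared on dimension 1 only
print(dominates_incomplete(a, c))  # transitivity would require True
```

In this sketch a dominates b, b dominates c, and c dominates a, so every tuple is dominated by some other tuple and a naive skyline would be empty. This is one reason estimating the missing values before computing the skyline is attractive.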

“…For Pareto models, PDP and PCP are not mutually expressive. While Pareto orders are widely studied in fields like voting theory [10] (unanimity), allocation problems [1] (Pareto optimality), decision making, database queries [2,7] (skyline operator) and economics (Pareto efficiency), there exists no general study of PDP or PCP based on Pareto orders so far. Pareto orders give a natural way of comparing alternatives; one alternative is better than another if it is better on all relevant evaluation functions (different criteria by which the alternatives can be evaluated).…”

Section: Introduction (mentioning)

confidence: 99%
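
The Pareto comparison described in the last citation statement is the basis of the skyline operator, and a small sketch may help make it concrete. The code below uses the usual skyline convention (no worse on every criterion and strictly better on at least one), which is slightly weaker than the "better on all" wording of the quote. It also assumes that larger values are preferred and that the data are complete. The names `dominates` and `skyline` and the sample points are hypothetical, and the quadratic scan is chosen for clarity, not efficiency.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and better somewhere.

    Assumption: larger values are preferred on every criterion.
    """
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical alternatives scored on two criteria.
alternatives = [(3, 9), (5, 5), (4, 4), (9, 1), (2, 8)]
print(skyline(alternatives))  # [(3, 9), (5, 5), (9, 1)]
```

Here (4, 4) is dominated by (5, 5) and (2, 8) is dominated by (3, 9), so neither appears in the result.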