2019 IEEE 35th International Conference on Data Engineering (ICDE) 2019
DOI: 10.1109/icde.2019.00056

Assessing and Remedying Coverage for a Given Dataset

Abstract: Data analysis impacts virtually every aspect of our society today. Often, this analysis is performed on an existing dataset, possibly collected through a process that the data scientists had limited control over. The existing data analyzed may not include the complete universe, but it is expected to cover the diversity of items in the universe. Lack of adequate coverage in the dataset can result in undesirable outcomes such as biased decisions and algorithmic racism, as well as creating vulnerabilities such as…

Cited by 75 publications (51 citation statements)
References 18 publications
“…Asudeh et al [12] first proposed a formalisation of the coverage problem. They also present and evaluate methods both to efficiently evaluate the coverage of a dataset with respect to thresholds set by a practitioner for each dataset attribute, and to identify the type of data samples that are preferable to collect to solve the coverage issue accounting for the cost of data collection.…”
Section: Dataset Coverage Characterization and Mitigation Methods (mentioning)
confidence: 99%
“…Another topic related to the identification of biases within datasets more specific to data management literature is the notion of data coverage. Coverage relates to the idea that data samples in a dataset should sufficiently cover the diversity of items in a universe of discourse [12]. Without adequate coverage, applications using such datasets might be prone to discriminative mistakes.…”
Section: Coverage (mentioning)
confidence: 99%
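
The notion of coverage quoted in the statements above can be illustrated with a minimal sketch. It is not the method of the cited paper: it assumes categorical attributes, a single count threshold for all combinations, and attribute domains taken from the values observed in the data. The function name `uncovered_combinations` and the sample rows are hypothetical.

```python
from collections import Counter
from itertools import product


def uncovered_combinations(rows, attributes, threshold):
    """Return value combinations over `attributes` that occur fewer than
    `threshold` times in `rows`, mapped to their observed counts."""
    # Assumption: each attribute's domain is the set of values seen in the data.
    domains = [sorted({row[a] for row in rows}) for a in attributes]
    # Count how many rows match each full value combination.
    counts = Counter(tuple(row[a] for a in attributes) for row in rows)
    # Enumerate every possible combination, including ones never observed.
    return {
        combo: counts[combo]
        for combo in product(*domains)
        if counts[combo] < threshold
    }


# Hypothetical toy dataset with two categorical attributes.
rows = [
    {"gender": "F", "age": "young"},
    {"gender": "F", "age": "young"},
    {"gender": "M", "age": "young"},
    {"gender": "M", "age": "old"},
    {"gender": "M", "age": "old"},
]
gaps = uncovered_combinations(rows, ["gender", "age"], threshold=2)
```

Here `gaps` lists the attribute-value combinations a practitioner might target when collecting additional data. Enumerating the full cross product grows exponentially with the number of attributes, so this brute-force form is only suitable for small examples.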
“…A major challenge is that the available data are often limited [7]. This way, any analysis is done with data that has been acquired independently, through a process on which the data scientist has limited control.…”
Section: Lack Of Data (mentioning)
confidence: 99%
“…After the data collection process, bias can also be reduced using many other approaches such as data aggregation techniques [19], systematically adding new data points to fix coverage [2], and crowdsourced bias detection [17]. Other approaches to mitigate bias can be employed during feature engineering and model training.…”
Section: Introduction (mentioning)
confidence: 99%