Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2022
DOI: 10.1145/3534678.3542604

Advances in Exploratory Data Analysis, Visualisation and Quality for Data Centric AI Systems

citations
Cited by 9 publications
(6 citation statements)
references
References 14 publications
“…The use of two case studies based on an EDA approach to data quality motivates a collection of research questions for statistics that cover theory, methodology, and software tools. Data visualization is a crucial EDA approach that uses visual elements like charts and graphs to make analysis simple and efficient [24]. When it comes to data quality profiling, visual EDA is very pertinent.…”
Section: Execution and Analysis (mentioning)
confidence: 99%
“…The lack of digital data in Greece presents a golden opportunity to start from scratch and potentially create data-centric AI systems that prioritise data quality over quantity based on a set of data that is scalable, adaptable, and governable (Patel et al, 2022). Technologically advanced and developed economies, such as the United States, Germany, Canada, and the United Kingdom, have achieved the digitalization of health and the collection of RWD, but are now facing significant challenges in transforming their systems and making them time-efficient and accessible to AI systems.…”
Section: Golden Opportunity To Start From Scratch (mentioning)
confidence: 99%
“…These algorithms are typically evaluated on the same task for which the dataset was collected, and the learned policy can be pessimistic in out-of-distribution states and actions, leading to poor generalization in unseen downstream tasks. Recently, data-centric approaches have become emerging, emphasizing the importance of training data quality over algorithmic advances (Motamedi, Sakharnykh, and Kaldewey 2021;Patel et al 2022). To improve training data quality, researchers have explored selecting the most critical samples or re-weighting (Wu et al 2021) all samples in the offline RL algorithms.…”
Section: Introduction (mentioning)
confidence: 99%