2020 International Conference on Data Mining Workshops (ICDMW) 2020
DOI: 10.1109/icdmw51313.2020.00082

SynC: A Copula based Framework for Generating Synthetic Data from Aggregated Sources

Cited by 22 publications (15 citation statements)
References 22 publications
“…Classical approaches for the data generation task include Copulas (Patki et al, 2016;Li et al, 2020b) and Bayesian Networks (Zhang et al, 2017b), where for the latter those based on the approximation proposed by Chow and Liu (1968) are especially popular. In the deep-learning era, Generative Adversarial Networks (GANs) (Goodfellow et al, 2014) have proven highly successful for the generation of images (Radford et al, 2016;Karras et al, 2020).…”
Section: Methods (mentioning)
confidence: 99%
“…Synthetic data - artificially generated data that mimic the original (observed) data by preserving relationships between variables (Nowok et al, 2016) - may be useful in several areas such as healthcare, finance, data science, and machine learning (Dahmen & Cook, 2019;Kamthe et al, 2021;Nowok et al, 2016;Patki et al, 2016). As such, copula-based data generation models - probabilistic models that allow for the statistical properties of observed data to be modelled in terms of individual behavior and (inter-)dependencies (Joe, 2014) - have shown potential in several applications such as finance, data science, and meteorology (Kamthe et al, 2021;Li et al, 2020;Meyer, Nagler, et al, 2021;Patki et al, 2016). Although copula-based data generation tools have been developed for tabular data - e.g.…”
Section: Summary and Statement Of Need (mentioning)
confidence: 99%
“…Although copula-based data generation tools have been developed for tabular data - e.g. the Synthetic Data Vault project using Gaussian copulas and generative adversarial networks (Patki et al, 2016;Xu & Veeramachaneni, 2018), or the Synthetic Data Generation via Gaussian Copula (Li et al, 2020) - in computational sciences such as weather and climate, data often consist of large, labelled multidimensional datasets with complex dependencies.…”
Section: Summary and Statement Of Need (mentioning)
confidence: 99%
“…This is an efficient algorithm that scales well in high dimensional settings. Unlike proximity based models that require pairwise distance calculation [38,26] or learning based models that require training, COPOD incurs low computational overhead.…”
Section: Probabilistic Based Outlier Detection Techniques (mentioning)
confidence: 99%