Proceedings of the 6th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation 2019
DOI: 10.1145/3360322.3361016

Towards Class-Balancing Human Comfort Datasets with GANs

Abstract: Human comfort datasets are widely used in smart buildings. From thermal comfort prediction to personalized indoor environments, labelled subjective responses from participants in an experiment are required to feed different machine learning models. However, many of these datasets have few samples per participant, few participants, or a class imbalance in their subjective responses. In this work we explore the use of Generative Adversarial Networks to generate synthetic samples to be used i…

Cited by 14 publications
(9 citation statements)
References 6 publications
“…rebalancing (Engelmann and Lessmann, 2021;Quintana and Miller, 2019;Koivu et al, 2020;Darabi and Elor, 2021) in particular. Another highly relevant topic is privacy-aware machine learning (Choi et al, 2017;Fan et al, 2020;Kamthe et al, 2021) where generated data can be used to overcome privacy concerns.…”
Section: Tabular Data Generation
Citation type: mentioning
confidence: 99%
“…Recently, Engelmann and Lessmann [54] also examined the ability of GANs to generate data in a structured (tabular) rather than unstructured (image) context, specifically in the field of credit scoring. Like Quintana and Miller [55], these researchers sought to generate and use both continuous and categorical explanatory variables. The authors opted for a Wasserstein GAN [56] architecture, with adjustments such as using the Gumbel-softmax activation function [57] in combination with embedding layers [58] to model discrete numerical variables, and min-max scaling paired with the addition of Gaussian noise data to avoid Discriminator detection of a trivial pattern ("number of loyalty points", for example, which in the real dataset only appears in increments of ten).…”
Section: Financial Transactions
Citation type: mentioning
confidence: 99%
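
The preprocessing step named in the statement above (min-max scaling combined with added Gaussian noise) can be illustrated with a minimal sketch. It is not the cited authors' code: the function name `min_max_with_noise`, the noise scale `sigma`, the order of the two operations, and the sample `loyalty_points` values are all assumptions made for illustration.

```python
import random


def min_max_with_noise(values, sigma=0.01, seed=0):
    """Scale values to [0, 1], then add zero-mean Gaussian noise.

    Hypothetical helper: the order (scale first, noise second) and the
    default sigma are assumptions, not taken from the cited work.
    """
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        raise ValueError("all values are identical; min-max scaling is undefined")
    scaled = [(v - lo) / span for v in values]
    # The noise blurs the regular grid that a variable recorded only in
    # fixed increments would otherwise leave in the scaled data.
    return [s + rng.gauss(0.0, sigma) for s in scaled]


# Made-up values that occur only in increments of ten.
loyalty_points = [0, 10, 20, 50, 100]
noisy = min_max_with_noise(loyalty_points, sigma=0.01, seed=0)
clean = min_max_with_noise(loyalty_points, sigma=0.0, seed=0)
```

With `sigma=0.0` the function reduces to plain min-max scaling, which makes the effect of the noise term easy to compare against.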
“…Another mostly unexplored application of GANs in imbalanced data settings in the realm of tabular datasets comes relating to human sentiment. Quintana and Miller [55] attempt to remedy class imbalance in a human comfort dataset [62], a dataset inquiring of participants the satisfaction with their living environments which contains a sizeable majority of "0" (neutral) labels. The research duo examines the performance of the Tabular-GAN framework developed by Xu and Veeramachaneni [63], as well as no treatment, GANCorr, and a basic GAN as baselines, all with a 70-30 train-test split and with KNN, Naïve Bayesian, and SVM learners.…”
Section: Other Disciplines
Citation type: mentioning
confidence: 99%
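
To make the class-balancing setup in the statement above concrete, the sketch below performs a 70-30 split and counts how many synthetic samples each class would need to reach the size of the majority class. It is an illustration only: the excerpt does not say how many samples were generated, so the "match the majority class" target, the helper names, and the made-up label distribution (with "0" as the majority) are assumptions.

```python
import random
from collections import Counter


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of rows and cut it into train and test parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def synthetic_needed(labels):
    """Return, per class, how many samples are missing to match the largest class.

    Assumption: balancing means topping every class up to the majority count.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


# Made-up labels with a sizeable majority of 0 (neutral) responses.
labels = [0] * 60 + [-1] * 25 + [1] * 15
train, test = train_test_split(labels)
needed = synthetic_needed(train)
```

Only the training split is balanced here, so the held-out 30% keeps the original class distribution for evaluation.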
“…To bridge the gap of generative methods for imbalanced and numerical thermal comfort datasets, and building on previous work [41], we propose comfortGAN, a conditional Wasserstein GAN with gradient penalty (cWGAN-GP) as a class balancing algorithm for data-driven thermal comfort modeling instead of commonly used methods. We assessed the performance of a balanced thermal comfort dataset, composed of generated and real samples, on a multi-class classification model, on scenarios where comfort feedback can take as much as seven distinct values, as well as a reduced version with only three possible values.…”
Section: Related Work and Novelty
Citation type: mentioning
confidence: 99%
“…Subsequent modifications on WGAN, known as WGAN-gradient penalty (WGAN-GP) [21], enhances training stability and have shown better results and convergence in practice compared to conventionally used image-based GAN variants (e.g., convolutional GANs), specifically on tabular data from other fields [46,48]. Therefore, we move away from the vanilla architecture used in [41] and on this work, we use the WGAN-GP loss variant for comfortGAN.…”
Section: Customized GAN for Thermal Comfort
Citation type: mentioning
confidence: 99%
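
For reference, the WGAN-GP loss mentioned in the statement above is usually written for the critic as shown below. This is the standard formulation from the WGAN-GP literature, not necessarily the exact objective used in comfortGAN, which the excerpt does not give. The symbols are assumptions of this sketch: $D$ is the critic, $P_r$ and $P_g$ are the real and generated data distributions, and $\lambda$ is the penalty weight.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\left[ D(\tilde{x}) \right]
    - \mathbb{E}_{x \sim P_r}\left[ D(x) \right]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[ \left( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \right)^2 \right],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The last term penalizes critic gradients whose norm departs from 1 at points interpolated between real and generated samples. In a conditional variant such as a cWGAN-GP, the critic would also receive the class label as input.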