33rd International Conference on Scientific and Statistical Database Management 2021
DOI: 10.1145/3468791.3468806

DJEnsemble: a Cost-Based Selection and Allocation of a Disjoint Ensemble of Spatio-temporal Models

Abstract: Consider a set of black-box models, each of them independently trained on a different dataset, answering the same predictive spatio-temporal query. Being built in isolation, each model traverses its own life-cycle until it is deployed to production, learning data patterns from different datasets and facing independent hyperparameter tuning. In order to answer the query, the set of black-box predictors has to be ensembled and allocated to the spatio-temporal query region. However, computing an optimal ensemble …

Cited by 6 publications (7 citation statements)
References 16 publications (10 reference statements)
“…Clipper [14] proposes a strategy to select models for ensemble inference by modeling the scenario as a multi-armed bandit problem. DJEnsemble [36] presents a cost-based approach for the automatic selection of black-box models to answer spatio-temporal queries. We do not consider such approaches in this work, mainly because they all require labeled data from the target domain for model selection, which could become a significant burden that complicates the model deployment phase.…”
Section: Related Work (mentioning)
confidence: 99%
“…The assumption is that by combining different models, the weaknesses of each one are compensated by the strengths of the others. However, DJEnsemble takes a slightly different approach [Pereira et al 2021]. As the traditional ensemble approach, it considers a set of available trained models M = {M_1, M_2, .…”
Section: DJEnsemble Approach (mentioning)
confidence: 99%
“…To analyze the algorithm performance when integrated to SAVIME, we performed a series of experiments evaluating the execution time of the different steps. For reference, we also measured the offline step execution time (already presented in [Pereira et al 2021]). For our experiments, we built a dataset from rain data from the city of Rio de Janeiro, provided by 33 pluviometrical stations.…”
Section: Experimental Evaluation (mentioning)
confidence: 99%
“…Regarding massive data processing and model training, in [Mirzasoleiman 2021] are discussed techniques for dataset characterization in a reduced number of representatives elements, with data-efficient methods to extract representative subsets that generalize the full data. Finally, DJEnsemble [Pereira et al 2021] investigates the prediction of spatio-temporal phenomena using deep-learning models; leveraging statistical properties of the t.s. to generate tiles in contrast of our shape-based approach.…”
Section: Evaluation of the Classifier for Model Selection (mentioning)
confidence: 99%