Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering 2012
DOI: 10.1145/2393596.2393637

Scalable test data generation from multidimensional models

Abstract: Multidimensional data models form the core of modern decision support software. The need for this kind of software is significant, and it continues to grow with the size and variety of datasets being collected today. Yet real multidimensional instances are often unavailable for testing and benchmarking, and existing data generators can only produce a limited class of such structures. In this paper, we present a new framework for scalable generation of test data from a rich class of multidimensional models. The…

Cited by 16 publications
(7 citation statements)
References 37 publications
“…Another data generation approach from outside the UML community is TestBlox [79]. TestBlox aims to produce large, valid, and statistically representative data sets from multidimensional models to test the performance of big-data computing platforms.…”
Section: Related Work (mentioning)
confidence: 99%
“…We note that many approaches generate large data sets to evaluate the performance of big data computing platforms [1,3,4,13,23,26,27]. These approaches are fundamentally different from test data generation approaches whose goal is to find faults that may exist in big data programs.…”
Section: Related Work (mentioning)
confidence: 99%
“…The reason is that the subtle correlations between attributes are often not captured. Another line of work [2,3,4,15,9,18] addresses this problem by considering a richer set of constraints, e.g., generating a database given a workload of queries such that each intermediate result has a certain size. The constraints are typically specified in a declarative language and the use of constraint solvers is very common in these works.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recent work [2,9,18] has proposed generating workload-aware datasets with the help of constraint solvers. However, these do not scale well to the amounts of data typically present in a customer dataset.…”
Section: Introduction (mentioning)
confidence: 99%