2020
DOI: 10.1016/j.jpdc.2020.01.005

Modeling I/O performance variability in high-performance computing systems using mixture distributions

Cited by 11 publications (6 citation statements)
References 14 publications
“…We focus on the prediction of the standard deviation of the throughput in this paper. Xu et al (2020) show that the distribution of the throughput is multi-modal and thus it is complicated. A more ambitious goal is to predict the system throughput distribution generally (e.g., Lux et al 2018).…”
Section: System Optimization Results
mentioning
confidence: 99%
“…In particular, the KS distance can be misleading when a CDF F(x) has a steep behavior (i.e., throughputs have multiple modes), and Xu et al (2020) show that multimodal behaviors commonly exist through the IOzone data. The KS distance measures the maximal error while EL 1 provides an average discrepancy.…”
Section: Conclusion and Areas For Future Research
mentioning
confidence: 99%
“…For example, Cameron et al (2019) study the standard deviation of the IOzone throughput. Xu et al (2020) show that the throughput distribution is multimodal so a summary statistic like standard deviation cannot represent the system variability. As an illustration, Figure 1(b) shows the histograms of the I/O throughput under four specific HPC system configurations.…”
Section: Introduction
mentioning
confidence: 99%
“…In the data collection stage, researchers identify HPC system settings for which I/O throughput data should be collected. Computer scientists often use grid-based designs (GBDs) to collect data under numerous possible system configurations, when the number of factors is relatively small (Cameron et al 2019, Xu et al 2020). Note that the GBDs are equivalent to full factorial designs.…”
mentioning
confidence: 99%
“…Another commonly-used statistical approximation model is Gaussian process (GP) regression (e.g., Sacks et al 1989, Currin et al 1991), which can generate a smooth surface and be capable of dealing with the heteroscedasticity (Goldberg, Williams, and Bishop 1998) in the response variable. In the HPC community, mixture models have been used to study the multimodal behavior of the throughput distribution (Xu et al 2020). Some novel numerical techniques, including max box mesh, iterative box mesh, and Voronoi mesh methods for interpolation, are investigated by .…”
mentioning
confidence: 99%