With the increasing focus on interoperability across cloud offerings to leverage their disparate capabilities, it has become increasingly important to provide a flexible framework for sharing heterogeneous resources in the cloud infrastructure. At the same time, it is imperative to understand the performance implications of hosting application workloads on different resources in order to guarantee Service Level Agreements (SLAs) to the applications. This paper focuses on the experimental characterization of the performance implications of heterogeneous resources when hosting big-data analytics workloads, one of the most critical classes of applications today. To build the knowledge on which our recommendations are based, we benchmark the performance of big-data analytics applications on a Hadoop cluster setup. Specifically, we study parameters of interest such as turnaround time and throughput, which are most likely to influence the choice of infrastructure for a particular application. Our experiments are conducted on varied platforms, both internal to Xerox and hosted by external cloud providers. Based on these experiments, we present a model that facilitates the characterization of heterogeneous applications, enabling the cloud middleware to select an appropriate infrastructure and metrics in order to attain the desired SLA.