The adoption of cloud computing facilities and programming models differs vastly between application domains. Scalable web applications, low-latency mobile backends and on-demand provisioned databases are typical cases for which cloud services on the platform or infrastructure level exist and are convincing on both technical and economic grounds. Applications with specific processing demands, including high-performance computing, high-throughput computing and certain flavours of scientific computing, have historically required special configurations such as compute- or memory-optimised virtual machine instances. With the rise of function-level compute instances through Function-as-a-Service (FaaS) models, the fitness of generic configurations needs to be re-evaluated for these applications. We analyse several demanding computing tasks with regard to how FaaS models compare against conventional monolithic algorithm execution. Besides the comparison, we contribute a refined FaaSification process for legacy software and provide a roadmap for future work.
The Cloud Computing paradigm focuses on the provisioning of reliable and scalable virtual infrastructures that deliver execution and storage services. This paradigm is particularly suitable for running resource-greedy scientific computing applications such as parameter sweep experiments (PSEs). Through the use of autoscalers, the virtual infrastructure can be scaled up and down by acquiring or terminating instances of virtual machines (VMs) while application tasks are being scheduled. In this paper, we extend an existing study centered on a state-of-the-art autoscaler called multiobjective evolutionary autoscaler (MOEA). MOEA uses a multiobjective optimization algorithm to determine the set of possible virtual infrastructure settings. In this context, the performance of MOEA is greatly influenced by the underlying optimization algorithm and its tuning. Therefore, we analyze two well-known multiobjective evolutionary algorithms (NSGA-II and NSGA-III) and how they affect the performance of the MOEA autoscaler. Simulated experiments with three real-world PSEs show that MOEA improves significantly when using NSGA-III instead of NSGA-II, because the former provides a better exploitation versus exploration trade-off.
The adequate management of scientific workflow applications strongly depends on the availability of accurate performance models of sub-tasks. Numerous approaches use machine learning to generate such models autonomously, thus reducing the human effort associated with this process. However, these standalone models may lack robustness, leading to a decay in the quality of the information provided to the workflow systems built on top of them. This paper presents a novel approach for learning ensemble prediction models of task runtime. The ensemble-learning method called bootstrap aggregating (bagging) is used to produce robust ensembles of M5P regression trees with better predictive performance than could be achieved by standalone models. Our approach has been tested on gene expression analysis workflows. The results show that the ensemble method leads to significant prediction-error reductions when compared with learned standalone models. This is the first initiative using ensemble learning for generating performance prediction models. These promising results encourage further research in this direction.
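As a rough illustration of bootstrap aggregating for runtime prediction, the following minimal sketch trains several base regressors on bootstrap resamples and averages their predictions. It is not the authors' implementation: the base learner here is a one-split regression stump used as a stand-in for an M5P tree, and the names `fit_stump`, `fit_bagging`, `n_models` and `seed`, as well as the single numeric input feature, are assumptions made for the example.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression stump on 1-D inputs.

    Assumption: this stump is only a stand-in for an M5P regression tree.
    """
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue  # a split must leave samples on both sides
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # All inputs in the sample are identical: predict the overall mean.
        overall = mean(ys)
        return lambda x: overall
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_bagging(xs, ys, n_models=25, seed=0):
    """Train n_models stumps on bootstrap resamples and average them."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(m(x) for m in models)


# Hypothetical data: input size of a task versus observed runtime in seconds.
sizes = [1, 2, 3, 4, 5, 6, 7, 8]
runtimes = [10, 10, 10, 10, 50, 50, 50, 50]
predict = fit_bagging(sizes, runtimes)
```

Averaging over resampled models is what gives the ensemble its robustness: an individual stump may place its split poorly on an unlucky resample, but the mean over many stumps smooths out such errors.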
Given the ubiquity and capabilities of mobile devices, some researchers now consider them as resource providers in distributed environments, called mobile Grids, for running resource-intensive software. Job scheduling therefore has to deal with device singularities such as energy constraints, mobility and unstable connectivity. Many existing schedulers consider at least one of these aspects, but their applicability strongly depends on information that is unavailable or difficult to estimate accurately, such as job execution time. Other efforts do not assume knowledge of job CPU requirements but ignore the energy consumed by data transfer operations, which is not realistic for data-intensive applications. This work, by contrast, treats the latter as non-negligible and known to the scheduler. Under these assumptions, we conduct a performance study of several traditional scheduling heuristics adapted to this environment, which are applied using the job information known to the scheduler but evaluated taking into account job information unknown to it. Experiments are performed via simulation software that employs hardware profiles derived from real mobile devices. Our goal is to contribute to a better understanding of both the capabilities and the limitations of this kind of scheduler in the incipient area of mobile Grids.
Cloud Computing is becoming the leading paradigm for executing scientific and engineering workflows. The large-scale nature of the experiments they model and their variable workloads make clouds the ideal execution environment, thanks to prompt and elastic access to huge amounts of computing resources. Autoscalers are middleware-level software components that scale the computing platform up and down by acquiring or terminating virtual machines (VMs) while workflow tasks are being scheduled. In this work, we propose a novel online multi-objective autoscaler for workflows called Cloud Multi-objective Intelligence (CMI), which aims to minimize makespan, monetary cost and the potential impact of errors derived from unreliable VMs. In addition, the problem is subject to monetary budget constraints. CMI is responsible for periodically solving the autoscaling problems encountered during the execution of a workflow. Simulation experiments on four well-known workflows show that CMI significantly outperforms a state-of-the-art autoscaler of similar characteristics called Spot Instances Aware Autoscaling (SIAA). These results provide a solid basis for further study of other meta-heuristic methods for autoscaling workflow applications on cheap but unreliable infrastructures.
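To make the multi-objective formulation concrete, the sketch below filters candidate scaling plans by a budget constraint and keeps only the Pareto-optimal ones with respect to makespan, monetary cost and failure impact. It illustrates the notion of dominance only and is not CMI's algorithm; the `Plan` type, its field names, the units and the sample values are all hypothetical.

```python
from typing import List, NamedTuple


class Plan(NamedTuple):
    """Hypothetical candidate scaling plan; all three objectives are minimized."""
    name: str
    makespan: float        # e.g. hours
    cost: float            # e.g. monetary units
    failure_impact: float  # e.g. expected work lost to unreliable VMs


def dominates(a: Plan, b: Plan) -> bool:
    """True if a is no worse than b on every objective and better on one."""
    objectives_a, objectives_b = a[1:], b[1:]
    no_worse = all(x <= y for x, y in zip(objectives_a, objectives_b))
    better = any(x < y for x, y in zip(objectives_a, objectives_b))
    return no_worse and better


def pareto_front(plans: List[Plan], budget: float) -> List[Plan]:
    """Drop plans over budget, then keep the non-dominated ones."""
    feasible = [p for p in plans if p.cost <= budget]
    return [p for p in feasible
            if not any(dominates(q, p) for q in feasible)]


candidates = [
    Plan("a", makespan=10, cost=5, failure_impact=0.1),
    Plan("b", makespan=8, cost=7, failure_impact=0.2),
    Plan("c", makespan=12, cost=6, failure_impact=0.3),   # dominated by "a"
    Plan("d", makespan=5, cost=20, failure_impact=0.0),   # over budget
]
front = pareto_front(candidates, budget=10)
```

Here the front contains plans "a" and "b": neither is better than the other on all three objectives, so choosing between them is a trade-off that the autoscaler still has to resolve.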