Interest in industrial applications of machine learning (ML) has grown considerably in recent years. As a consequence, ML engineers are in high demand across the industry, yet improving their efficiency remains a fundamental challenge. Automated machine learning (AutoML) has emerged as a way to save time and effort on repetitive tasks in ML pipelines, such as data pre-processing, feature engineering, model selection, hyperparameter optimization, and prediction result analysis. In this paper, we investigate the current state of AutoML tools that aim to automate these tasks. We evaluate the tools on many datasets and in different data segments to examine their performance, and we compare their advantages and disadvantages on different test cases.
The era of big data has led to the emergence of new systems for real-time distributed stream processing. Apache Storm is one of the most popular stream processing systems in industry today. However, Storm, like many other stream processing systems, lacks an intelligent scheduling mechanism. The default round-robin scheduling currently deployed in Storm disregards resource demands and availability, and can therefore be inefficient at times. We present R-Storm (Resource-Aware Storm), a system that implements resource-aware scheduling within Storm. R-Storm is designed to increase overall throughput by maximizing resource utilization while minimizing network latency. When scheduling tasks, R-Storm can satisfy both soft and hard resource constraints while minimizing the network distance between components that communicate with each other. We evaluate R-Storm on a set of micro-benchmark Storm applications as well as Storm applications used in production at Yahoo! Inc. From our experimental results we conclude that R-Storm achieves 30-47% higher throughput and 69-350% better CPU utilization than default Storm for the micro-benchmarks. For the Yahoo! Storm applications, R-Storm outperforms default Storm by around 50% in overall throughput. We also demonstrate that R-Storm performs much better than default Storm when scheduling multiple Storm applications.
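The idea of resource-aware placement described above can be illustrated with a small sketch. It is not the R-Storm algorithm itself: the `Node` and `Task` types, the three-level `network_distance` metric, and the greedy cost ordering are all hypothetical choices made for illustration. The sketch assumes memory is a hard constraint (never exceeded) and CPU is a soft constraint (overcommitment is allowed but penalised), and it prefers nodes close to the one that received the previous task.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    name: str
    rack: str
    cpu: float      # available CPU points (soft constraint in this sketch)
    mem_mb: float   # available memory in MB (hard constraint in this sketch)


@dataclass
class Task:
    name: str
    cpu: float
    mem_mb: float


def network_distance(a: Node, b: Node) -> int:
    """Toy metric (an assumption): 0 same node, 1 same rack, 2 across racks."""
    if a.name == b.name:
        return 0
    return 1 if a.rack == b.rack else 2


def schedule(tasks: List[Task], nodes: List[Node]) -> Dict[str, str]:
    """Greedily place tasks in order; return a task-name -> node-name map.

    Note: the nodes' remaining resources are updated in place.
    """
    placement: Dict[str, str] = {}
    previous: Optional[Node] = None
    for task in tasks:
        # Hard constraint: only consider nodes with enough remaining memory.
        feasible = [n for n in nodes if n.mem_mb >= task.mem_mb]
        if not feasible:
            raise RuntimeError(f"no node has enough memory for {task.name}")

        def cost(n: Node) -> Tuple[float, int, str]:
            # Soft constraint: CPU overcommitment is allowed but ranked worst.
            cpu_overcommit = max(0.0, task.cpu - n.cpu)
            hops = network_distance(previous, n) if previous is not None else 0
            # Node name is the final tie-breaker so results are deterministic.
            return (cpu_overcommit, hops, n.name)

        chosen = min(feasible, key=cost)
        chosen.cpu -= task.cpu
        chosen.mem_mb -= task.mem_mb
        placement[task.name] = chosen.name
        previous = chosen
    return placement
```

With two nodes on one rack and a third on another, consecutive tasks are packed onto the same node until its memory runs out, after which the scheduler falls back to the nearest node that still fits the task. A round-robin placement would instead spread the same tasks across all nodes regardless of their demands.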
As the use of cloud computing resources grows in academic research and industry, so does the likelihood of failures that catastrophically affect the applications being run on the cloud. For that reason, cloud service providers as well as cloud applications need to expect failures and shield their services accordingly. We propose a new model called Failure Scenario as a Service (FSaaS). FSaaS will be utilized across the cloud for testing the resilience of cloud applications. In an effort to provide both Hadoop service and application vendors with the means to test their applications against the risk of massive failure, we focus our efforts on the Hadoop platform. We have generated a series of failure scenarios for certain types of jobs. Customers will be able to choose specific scenarios based on their jobs to evaluate their systems.
Navigation within the physical library building can be supported with mobile computing technology; specifically, a path suggestion software application on a patron's mobile device can direct her to the location of the physical item on the shelf. This is accomplished by leveraging existing WiFi access points within a library building as well as supplementing wireless infrastructures with additional wireless beacons for collections-based wayfinding.
This paper considers mission assurance for critical cloud applications, a set of applications with growing importance to governments and military organizations. Specifically, we consider applications in which assigned tasks or duties are performed in accordance with an intended purpose or plan in order to accomplish an assured mission. Mission-critical cloud computing may involve hybrid (public, private, heterogeneous) clouds and require the realization of "end-to-end" and "cross-layered" security, dependability, and timeliness. We propose the properties and building blocks of a middleware for assured cloud computing that can support critical missions. In this approach, we assume that mission-critical cloud computing must be designed with assurance in mind. In particular, the middleware in such systems must include sophisticated monitoring, assessment of policies, and response to handle the configuration and management of dynamic systems-of-systems with both trusted and partially trusted resources (data, sensors, networks, computers, etc.) and services sourced from multiple organizations.