Abstract: The advent of big data analytics and cloud computing has prompted widespread research on the data placement problem, which aims to place data items appropriately across distributed datacenters. Although uniformly partitioning data across distributed nodes is the de facto standard for many popular distributed data stores such as HDFS and Cassandra, this approach may cause network congestion for data-intensive services, thereby reducing system throughput. This is because, unlike MapReduce-style workloads, data-intensive services require access to multiple datasets within each transaction. In this paper, we propose a scalable method for placing the data of data-intensive services in geographically distributed clouds. The proposed algorithm partitions a set of data items across geo-distributed clouds using spectral clustering on hypergraphs. Additionally, our spectral clustering algorithm leverages randomized techniques to obtain low-rank approximations of the hypergraph matrix, which makes computing the spectrum of the hypergraph Laplacian more scalable. Experiments on a real-world trace-based online social network dataset show that the proposed algorithm is effective, efficient, and scalable. Empirically, its efficacy on the evaluated metrics is comparable to, and in certain scenarios better than, that of state-of-the-art techniques, while it runs up to 10 times faster.
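The idea summarized in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the normalized hypergraph Laplacian L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} (one common definition, which the abstract does not specify), a basic randomized range finder for the low-rank step, and a plain k-means on the spectral embedding. All function names, the toy incidence matrix `H`, and the weights `w` are hypothetical. Rows of `H` are data items, and each column is a hyperedge grouping the items that one request accesses together.

```python
import numpy as np

def normalized_incidence(H, w):
    """Return A = Dv^{-1/2} H W^{1/2} De^{-1/2}, so that A @ A.T = I - L
    for the (assumed) normalized hypergraph Laplacian L."""
    dv = H @ w            # weighted degree of each data item
    de = H.sum(axis=0)    # number of items in each hyperedge
    return (H / np.sqrt(dv)[:, None]) * np.sqrt(w / de)[None, :]

def randomized_left_singular_vectors(A, k, oversample=5, n_iter=2, seed=0):
    """Approximate the top-k left singular vectors of A with a randomized
    range finder followed by an SVD of the small projected matrix."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A @ G)
    for _ in range(n_iter):          # power iterations sharpen the basis
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    U_small, _, _ = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k]

def kmeans(X, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def place_items(H, w, k):
    """Assign each data item (row of H) to one of k clusters (datacenters)."""
    A = normalized_incidence(H, w)
    # The top singular vectors of A span the eigenvectors of L with the
    # smallest eigenvalues, which form the spectral embedding.
    U = randomized_left_singular_vectors(A, k)
    U = U / np.linalg.norm(U, axis=1, keepdims=True)  # row-normalize
    return kmeans(U, k)

# Toy example (hypothetical data): 8 items, 7 hyperedges.
# Items 0-3 are accessed together, items 4-7 are accessed together,
# and one low-weight hyperedge {3, 4} links the two groups.
edges = [(0, 1, 2), (1, 2, 3), (0, 3), (4, 5, 6), (5, 6, 7), (4, 7), (3, 4)]
H = np.zeros((8, len(edges)))
for e, members in enumerate(edges):
    H[list(members), e] = 1.0
w = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.1])

if __name__ == "__main__":
    print(place_items(H, w, k=2))
```

In this sketch the eigenvectors are obtained from a singular value decomposition of the scaled incidence matrix, so the items-by-items Laplacian is never formed explicitly. Whether the paper takes the same route is not stated in the abstract.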
Unikernels are a relatively recent way to create and quickly deploy extremely small virtual machines that, by leaving out unnecessary parts, require less functional and operational software overhead than containers or conventional virtual machines. This paradigm aims to replace bulky virtual machines on the one hand, and to open up new classes of hardware for virtualization and networking applications on the other. In recent years, the toolchains used to create unikernels have grown from proofs of concept into platforms that can run both new and existing software written in various programming languages. This paper studies the performance (both execution time and memory footprint) of unikernels versus Docker containers in the context of REST services and heavy processing workloads written in Java, Go, and Python. The results of the performance evaluations make it possible to predict which cases could benefit from using unikernels instead of containers.