Abstract. Correlations between locally averaged host observations, at different times and places, hint at information about the associations between the hosts in a network. These smoothed, pseudo-continuous time-series imply relationships with entities in the wider environment. For anomaly detection, mining this information might provide a valuable source of observational experience for determining comparative anomalies or rejecting false anomalies. The difficulties with distributed analysis lie in collating the distributed data and in comparing observables on different hosts, in different frames of reference. In the present work, we examine two methods (Principal Component Analysis and Eigenvector Centrality) that shed light on the usefulness of comparing data destined for different locations in a network.
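As a rough illustration of the second method named above, eigenvector centrality can be computed by power iteration on a symmetric matrix of host-to-host association weights. The sketch below is not taken from the paper: the function name `eigenvector_centrality`, the four-host star network, and the unit weights are all assumptions chosen for illustration, and the iteration is applied to A + I (which has the same eigenvectors as A) so that it also converges on bipartite graphs.

```python
def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Power iteration on a symmetric, non-negative weight matrix.

    Hypothetical helper: iterates with A + I rather than A, which leaves the
    eigenvectors unchanged but avoids oscillation on bipartite graphs.
    """
    n = len(adj)
    x = [1.0 / n] * n
    for _ in range(iterations):
        # y = (A + I) x
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        y = [v / norm for v in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


# Assumed toy network: host 0 is associated with hosts 1, 2 and 3,
# which are not associated with each other.
weights = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
centrality = eigenvector_centrality(weights)
print(centrality)
```

In this toy case the hub host receives the highest score and the three leaf hosts receive equal scores. In practice the weights might be derived from correlations between the hosts' time-series, but how to do so is a modelling choice not fixed by the abstract.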
Abstract. Cloud Computing (CC) is becoming increasingly pertinent and popular. A natural consequence of this is that many modern-day data centers experience very high internal traffic. The VMs with high mutual traffic often end up being far apart in the data center network, forcing them to communicate over unnecessarily long distances. The consequent traffic bottlenecks negatively affect the performance of both the application and the network in its entirety, posing nontrivial challenges for the administrators of these cloud-based data centers. The problem can, quite naturally, be divided into two consecutive phases. First, the VMs are consolidated with a VM clustering algorithm built on Learning Automata (LA). By mapping the clustering problem onto the Graph Partitioning (GP) problem, our LA-based solution reduces the total communication cost by between 34% and 85%. Thereafter, the resulting clusters are assigned to the server racks using a cluster placement algorithm that relies on a completely different intelligent strategy, namely Simulated Annealing (SA). This phase further reduces the total cost of communication by between 89% and 99%. The analysis and results for different models and topologies demonstrate that the optimization is done in a fast and computationally efficient way. Indeed, as far as we know, this paper pioneers the application of LA in the traffic-aware consolidation of virtual machines in data centers, and also pioneers a strategy which serializes the tools in LA and SA to optimize CC.
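The second phase described above, placing clusters on racks with Simulated Annealing, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the cost function (inter-cluster traffic multiplied by rack distance), the swap move, the geometric cooling schedule, and every name and parameter value (`placement_cost`, `anneal_placement`, `steps`, `t_start`, `t_end`) are hypothetical. The LA-based clustering phase is not shown.

```python
import math
import random


def placement_cost(perm, traffic, dist):
    """Total cost: traffic between each cluster pair times the distance
    between the racks they are assigned to (perm[i] is cluster i's rack)."""
    n = len(perm)
    return sum(traffic[i][j] * dist[perm[i]][perm[j]]
               for i in range(n) for j in range(i + 1, n))


def anneal_placement(traffic, dist, steps=5000, t_start=10.0, t_end=0.01, seed=0):
    """Simulated annealing over one-cluster-per-rack assignments (assumed setup)."""
    rng = random.Random(seed)
    n = len(traffic)
    perm = list(range(n))
    cost = placement_cost(perm, traffic, dist)
    best, best_cost = perm[:], cost
    for step in range(steps):
        # Geometric cooling from t_start towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]          # propose: swap two racks
        new_cost = placement_cost(perm, traffic, dist)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost                           # accept the move
            if cost < best_cost:
                best, best_cost = perm[:], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]      # reject: undo the swap
    return best, best_cost


# Assumed toy instance: four racks in a line, heavy traffic between clusters 0 and 3.
dist = [[abs(a - b) for b in range(4)] for a in range(4)]
traffic = [
    [0, 0, 0, 10],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [10, 0, 0, 0],
]
initial_cost = placement_cost(list(range(4)), traffic, dist)
best, best_cost = anneal_placement(traffic, dist)
print(initial_cost, best, best_cost)
```

Accepting some cost-increasing swaps while the temperature is high lets the search leave poor local arrangements; as the temperature falls, the procedure approaches plain greedy improvement.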