Data abundance creates a need for powerful, easy-to-use tools for processing large amounts of data. MapReduce has been adopted by many companies for over a decade and, more recently, has attracted the attention of a growing number of researchers in several areas. One main advantage is that the complex details of parallel processing, such as network programming, task scheduling, data placement, and fault tolerance, are hidden behind a conceptually simple framework. MapReduce is supported by mature software technologies for deployment in data centers, such as Hadoop. As MapReduce becomes popular for high-performance applications, many questions arise concerning its performance and efficiency. In this paper, we formally demonstrate lower bounds on the isoefficiency function for MapReduce applications that can be modeled as BSP jobs. We also show how communication and synchronization costs can dominate MapReduce computations and discuss the conditions under which these scalability limits hold. To our knowledge, this is the first study that demonstrates scalability bounds for MapReduce applications. We also discuss how some MapReduce implementations, such as Hadoop, can mitigate these costs to approach linear or near-linear speedups.
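As background for the bounds mentioned above, the standard BSP cost model and the textbook definition of the isoefficiency function can be sketched as follows. The symbols used here ($w_s$, $h_s$, $g$, $l$, $W$, $T_o$, $K$) follow common usage and are assumptions; the abstract gives neither the paper's own notation nor the bounds it derives.

```latex
% BSP cost of a job with S supersteps on p processors:
%   w_s = maximum local computation in superstep s
%   h_s = maximum number of messages sent or received by any processor
%   g   = communication cost per message unit, l = barrier synchronization cost
T_p = \sum_{s=1}^{S} \left( w_s + h_s\, g + l \right)

% Total overhead and parallel efficiency, with W = T_1 the sequential work:
T_o(W, p) = p\, T_p - W,
\qquad
E = \frac{W}{p\, T_p} = \frac{1}{1 + T_o(W, p) / W}

% Isoefficiency: to hold E constant as p grows, the work must grow as
W = K\, T_o(W, p), \qquad K = \frac{E}{1 - E}
```

Under this model, the $h_s\, g$ and $l$ terms enter $T_o$ directly, which is why communication and synchronization costs constrain how fast $W$ must grow with $p$.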
A key feature of virtualization technology is live migration, which allows a Virtual Machine (VM) to be moved from one physical host to another without interrupting its execution. This feature enables more sophisticated policies inside a cloud environment, such as optimizing energy and computational resources and improving quality of service. However, live migration can impose severe performance degradation on the VM application and affect the service provider's infrastructure in several ways, such as network congestion and performance degradation of colocated VMs. Unlike several previous studies, we consider the VM workload an important factor, and we argue that carefully choosing the moment to migrate a VM can reduce live migration penalties. This paper introduces a method that identifies the workload cycles of a VM and uses that information to postpone a live migration. In our experiments with relevant benchmarks, the proposed method reduced network data transfer by up to 43% and live migration time by up to 74% compared to traditional consolidation strategies that perform live migration without considering the VM workload.
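The abstract does not describe how workload cycles are identified, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes utilization samples in the range 0 to 1 taken at a fixed interval, uses plain autocorrelation as a stand-in cycle detector, and postpones migration when the upcoming phase of the cycle has historically been busy. All function names, parameters, and thresholds are hypothetical.

```python
from statistics import mean


def autocorrelation(samples, lag):
    """Normalized autocorrelation of `samples` at the given lag."""
    mu = mean(samples)
    centered = [s - mu for s in samples]
    denom = sum(c * c for c in centered)
    if denom == 0:
        return 0.0  # constant signal: no cycle information
    num = sum(centered[i] * centered[i + lag]
              for i in range(len(samples) - lag))
    return num / denom


def estimate_cycle_length(samples, min_lag=2, max_lag=None, threshold=0.5):
    """Return the lag with the highest autocorrelation above `threshold`,
    or None when no lag qualifies (hypothetical threshold)."""
    if max_lag is None:
        max_lag = len(samples) // 2
    best_lag, best_score = None, threshold
    for lag in range(min_lag, max_lag + 1):
        score = autocorrelation(samples, lag)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def should_postpone(samples, cycle_length, high_load=0.7):
    """Postpone if the next sample falls in a phase of the cycle whose
    historical mean utilization is at least `high_load` (assumed cutoff)."""
    phase = len(samples) % cycle_length
    history = samples[phase::cycle_length]
    return mean(history) >= high_load


if __name__ == "__main__":
    # Synthetic trace: three idle samples followed by three busy samples.
    trace = [0.1, 0.1, 0.1, 0.9, 0.9, 0.9] * 6
    cycle = estimate_cycle_length(trace)
    print("cycle length:", cycle)
    print("postpone now:", should_postpone(trace, cycle))
```

A real consolidation policy would also need to bound how long a migration may be deferred; that is left out of this sketch.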
Genome analysis is a widely researched area that enables the study of diseases and the development of new treatments. To this end, researchers analyze genomes assembled with computational tools. This work presents a performance analysis of a hybrid correction algorithm for genomic sequences, a necessary step in genome assembly. Seven versions of the algorithm were implemented in order to compare their performance. The test results show that performance gains of up to about 17 times over the sequential version are possible, and that the best version of the algorithm exhibits superlinear scalability.