Carlos H. Á. Costa scite author profile

Active Memory Cube: A processing-in-memory architecture for exascale systems

Many studies point to the difficulty of scaling existing computer architectures to meet the needs of an exascale system (i.e., capable of executing 10^18 floating-point operations per second), consuming no more than 20 MW in power, by around the year 2020. This paper outlines a new architecture, the Active Memory Cube, which reduces the energy of computation significantly by performing computation in the memory module, rather than moving data through large memory hierarchies to the processor core. The architecture leverages a commercially demonstrated 3D memory stack called the Hybrid Memory Cube, placing sophisticated computational elements on the logic layer below its stack of dynamic random-access memory (DRAM) dies. The paper also describes an Active Memory Cube tuned to the requirements of a scientific exascale system. The computational elements have a vector architecture and are capable of performing a comprehensive set of floating-point and integer instructions, predicated operations, and gather-scatter accesses across memory in the Cube. The paper outlines the software infrastructure used to develop applications and to evaluate the architecture, and describes results of experiments on application kernels, along with performance and power projections.

A massively parallel infrastructure for adaptive multiscale simulations

Natale, Bhatia, Carpenter et al. 2019

Deep Sequencing Reveals Occult Mansonellosis Coinfections in Residents From the Brazilian Amazon Village of São Gabriel da Cachoeira

Crainey, Costa, Leles et al. 2020

Mansonella ozzardi and Mansonella perstans infections both cause mansonellosis but are usually treated differently. Using a real-time polymerase chain reaction assay and deep sequencing, we reveal the presence of mansonellosis coinfections that were undetectable by standard diagnostic methods. Our results confirm mansonellosis coinfections and have important implications for the disease’s treatment and diagnosis.

Leveraging Adaptive I/O to Optimize Collective Data Shuffling Patterns for Big Data Analytics

Nicolae, Costa, Misale et al. 2017

IEEE Trans. Parallel Distrib. Syst.

Big data analytics is an indispensable tool in transforming science, engineering, medicine, health-care, finance and ultimately business itself. With the explosion of data sizes and need for shorter time-to-solution, in-memory platforms such as Apache Spark gain increasing popularity. In this context, data shuffling, a particularly difficult transformation pattern, introduces important challenges. Specifically, data shuffling is a key component of complex computations that has a major impact on the overall performance and scalability. Thus, speeding up data shuffling is a critical goal. To this end, state-of-the-art solutions often rely on overlapping the data transfers with the shuffling phase. However, they employ simple mechanisms to decide how much data and where to fetch it from, which leads to sub-optimal performance and excessive auxiliary memory utilization for the purpose of prefetching. The latter aspect is a growing concern, given evidence that memory per computation unit is continuously decreasing while interconnect bandwidth is increasing. This paper contributes a novel shuffle data transfer strategy that addresses the two aforementioned dimensions by dynamically adapting the prefetching to the computation. We implemented this novel strategy in Spark, a popular in-memory data analytics framework. To demonstrate the benefits of our proposal, we run extensive experiments on an HPC cluster with large core count per node. Compared with the default Spark shuffle strategy, our proposal shows up to 40% better performance with 50% less memory utilization for buffering, and excellent weak scalability.

Molecular detection of Mansonella mariae incriminates Simulium oyapockense as a potentially important bridge vector for Amazon-region zoonoses

Silva, Narzetti, Crainey et al. 2022

Infection, Genetics and Evolution

SparkGA

Mushtaq, Liu, Costa et al. 2017

A System Software Approach to Proactive Memory-Error Avoidance

Costa, Park, Rosenburg et al. 2014

You Only Run Once: Spark Auto-Tuning From a Single Run

Buchaca, Portella, Costa et al. 2020

IEEE Trans. Netw. Serv. Manage.

Tuning configurations of Spark jobs is not a trivial task. State-of-the-art auto-tuning systems are based on iteratively running workloads with different configurations. During the optimization process, the relevant features are explored to find good solutions. Many optimizers enhance the time-to-solution using black-box optimization algorithms that do not take into account any information from the Spark workloads. In this paper, we present a new method for tuning configurations that uses information from one run of a Spark workload. To achieve good performance, we mine the SparkEventLog that is generated by the Spark engine. This log file contains a large amount of information from the executed application. We use this information to enhance a performance model with low-level features from the workload to be optimized. These features include Spark Actions, Transformations, and Task metrics. This process allows us to obtain application-specific workload information. With this information our system can predict sensible Spark configurations for unseen jobs, given that it has been trained with reasonable coverage of Spark applications. Experiments show that the presented system correctly produces good configurations, while achieving up to 80% speedup with respect to the default Spark configuration, and up to 12x speedup of the time-to-solution with respect to a standard Bayesian Optimization procedure.

Active Memory Cube: A processing-in-memory architecture for exascale systems
