Computational storage: an efficient and scalable platform for big data and HPC applications

Torabzadehkashi, Mahdi; Rezaei, Siavash; HeydariGorji, Ali; Bobarshad, Hossein; Alves, V. Castro; Bagherzadeh, Nader

doi:10.1186/s40537-019-0265-5

Cited by 26 publications

(7 citation statements)

References 32 publications

(30 reference statements)


“…It presents several advantages, such as scalability, flexibility, cost-effectiveness, organized architecture, and resilience to failure. Recently, several Hadoop-based platforms have been proposed as efficient and flexible solutions for computational storage with the strategy of processing data close to where they reside [53][54][55]. Among them, we cite the lineage-aware data management (LDM) that exploits the data locality to decrease the network footprint [56,57].…”

Section: MapReduce Model of Computation (mentioning)

confidence: 99%

Efficient parallel derivation of short distinguishing sequences for nondeterministic finite state machines using MapReduce

et al. 2021


Distinguishing sequences are widely used in finite state machine-based conformance testing to solve the state identification problem. In this paper, we address the scalability issue encountered while deriving distinguishing sequences from complete observable nondeterministic finite state machines by introducing a massively parallel MapReduce version of the well-known Exact Algorithm. To the best of our knowledge, this is the first study to tackle this task using the MapReduce approach. First, we give a concise overview of the well-known Exact Algorithm for deriving distinguishing sequences from nondeterministic finite state machines. Second, we propose a parallel algorithm for this problem using the MapReduce approach and analyze its communication cost using Afrati et al. model. Furthermore, we conduct a variety of intensive and comparative experiments on a wide range of finite state machine classes to demonstrate that our proposed solution is efficient and scalable.


“…One unique feature with our CSD is that the CBDD supports file-system access by the ISP applications and the host. By the embedded Linux to mount partitions on the storage and present a file API, this makes it not only easier to port applications but also more efficient to access the data [18], [19]. The host CPU and ISP units can also communicate via the same partition, which is mounted by both the host and the ISP.…”

Section: B. System Software on the CSD (mentioning)

confidence: 99%

In-storage Processing of I/O Intensive Applications on Computational Storage Drives

HeydariGorji, Torabzadehkashi, Rezaei et al. 2021

Preprint

Self Cite


Computational storage drives (CSD) are solid-state drives (SSD) empowered by general-purpose processors that can perform in-storage processing. They have the potential to improve both performance and energy significantly for big-data analytics by bringing compute to data, thereby eliminating costly data transfer while offering better privacy. In this work, we introduce Solana, the first-ever high-capacity (12-TB) CSD in E1.S form factor, and present an actual prototype for evaluation. To demonstrate the benefits of in-storage processing on CSD, we deploy several natural language processing (NLP) applications on datacenter-grade storage servers comprised of clusters of the Solana. Experimental results show up to 3.1x speedup in processing while reducing the energy consumption and data transfer by 67% and 68%, respectively, compared to regular enterprise SSDs.


“…While the first approach is power- and cost-efficient, it provides limited processing power, and also it can negatively impact the performance of the SSD controller. In the previous works focusing on the second approach, researchers have investigated different types of dedicated processing engines such as FPGAs [14] and embedded processors [23,24]. Although FPGAs are power efficient, they have several limitations for being used in ISPs, including the challenging step of implementing an RTL design of the tasks [18], the time-consuming step of reconfiguration of FPGAs for running different tasks, and the lack of supporting file systems to allow tasks to deal with the concept of file when accessing the storage.…”

Section: In-storage Processing (mentioning)

confidence: 99%

HyperTune

HeydariGorji, Rezaei, Torabzadehkashi et al. 2020

Proceedings of the 39th International Conference on Computer-Aided Design

Self Cite

