Abstract. Flash-based solid state drives (SSDs) are gradually replacing hard disk drives (HDDs) as the primary non-volatile storage in both desktop and enterprise applications because of their potential to improve performance and reduce power consumption. However, database query processing engines are designed around the fundamental characteristics of HDDs, so they may not benefit immediately from SSDs. Previous research on optimizing database query processing on SSDs has mainly focused on leveraging the high random data access performance of SSDs and avoiding slow random writes whenever possible. However, it fails to exploit the rich internal parallelism of SSDs. In this paper, we focus on exploiting this internal parallelism to optimize scan and join operators. First, we detect the internal parallelism of SSDs, treating them as black boxes. Then we propose a parallel table scan operator called ParaScan to take full advantage of the internal parallelism of SSDs. Based on ParaScan, we also present an efficient parallel join operator called ParaHashJoin to accelerate database query processing. Experimental results on TPC-H datasets show that ParaScan on SSD significantly outperforms the traditional table scan on SSD by 1X, and that ParaHashJoin is 1.5X faster than the traditional hash join operator, especially when join selectivity is small.