Computing Depth-First Search (DFS) results, i.e., the depth-first order or the DFS-Tree, in the semi-external setting has become a hot topic in the big data era, because graphs are growing rapidly in scale and can hardly be held in main memory. Existing semi-external DFS algorithms assume that main memory can hold at least a spanning tree T of a graph G, and they gradually restructure T into a DFS-Tree, which is non-trivial. In this paper, we present a comprehensive study of the semi-external DFS problem, including what is, as far as we know, the first theoretical analysis of its main challenge. We also introduce a new semi-external DFS algorithm, named EP-DFS, built on an efficient edge pruning principle. Unlike traditional algorithms, we aim not only to solve this complex problem with fewer I/Os, but also to do so with simpler CPU computation (implementation-friendly) and fewer random I/O accesses (key to efficiency). The former is achieved by our efficient pruning principle; the latter by a lightweight index, the N+-index, which is a compressed storage for a subset of the edges of G. An extensive experimental evaluation on both synthetic and real graphs confirms that EP-DFS outperforms existing techniques.