The development of knowledge graphs requires the support of vast quantities of data. However, the rapid growth in data volume is placing increasing demands on machines. Centralized data storage requires high-performance hosts, which are costly and introduce a single point of failure. Distributed data storage can greatly reduce machine cost and has no single point of failure, but it imposes requirements on how the data collection is partitioned and stored. In domain-specific knowledge storage, the way graph data is partitioned and stored varies with the domain knowledge. To solve these problems, a scheme of graph partition and distributed storage for domain-specific knowledge graphs is proposed. The proposed graph partition scheme takes the correlation between data into account and assigns nodes that are affiliated with each other to the same or similar partitions. A distributed aggregation storage scheme is designed, which makes full use of cluster performance and solves the problem of data consistency during data insertion and update. The proposed distributed storage scheme is based on HBase and is combined with Neo4j to realize visual query effectively. Experimental results show the efficiency and effectiveness of the proposed method in terms of partition time, number of cut edges, and update time.
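To make the edge-cut metric mentioned above concrete, the following minimal sketch counts the edges whose endpoints fall into different partitions. It is an illustration only, not the proposed partition scheme: the function name `edge_cut`, the toy graph, and both partition assignments are hypothetical and chosen solely to show why keeping affiliated nodes together tends to lower the number of cut edges.

```python
from typing import Dict, Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable]


def edge_cut(edges: Iterable[Edge], assignment: Dict[Hashable, int]) -> int:
    """Count edges whose two endpoints are assigned to different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


# Hypothetical toy graph: two triangles (a, b, c) and (d, e, f) joined by c-d.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("c", "d"),
    ("d", "e"), ("e", "f"), ("d", "f"),
]

# Assumed partition that keeps closely related nodes together.
correlated = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

# Assumed partition that ignores graph structure (alternating assignment).
scattered = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}

print(edge_cut(edges, correlated))
print(edge_cut(edges, scattered))
```

In this toy case the correlation-aware assignment cuts only the single bridging edge, while the structure-agnostic assignment cuts most edges, so more queries would have to cross partition boundaries.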