As graph analytics often involves compute-intensive operations, GPUs have been extensively used to accelerate the processing. However, in many applications such as social networks, cyber security, and fraud detection, the underlying graphs evolve frequently, and the graph structure on the GPU must be rebuilt to incorporate the updates. Hence, rebuilding the graph becomes the bottleneck when processing high-speed graph streams. In this paper, we propose a GPU-based dynamic graph storage scheme that readily supports existing graph algorithms. Furthermore, we propose parallel update algorithms to support efficient stream updates so that the maintained graph is immediately available for high-speed analytic processing on GPUs. Our extensive experiments with three streaming applications on large-scale real and synthetic datasets demonstrate the superior performance of our proposed approach.
Personalized PageRank (PPR) is a well-known proximity measure in graphs. To meet the need for dynamic PPR maintenance, recent works have proposed a local update scheme to support incremental computation. Nevertheless, sequential execution of the scheme is still too slow for high-speed stream processing. Therefore, we are motivated to design a parallel approach for dynamic PPR computation. First, as updates always come in batches, we devise a batch processing method that reduces the synchronization cost across individual updates and exposes more parallelism for iterative parallel execution. Our theoretical analysis shows that the parallel approach has the same asymptotic complexity as the sequential approach. Second, we devise novel optimization techniques to effectively reduce runtime overheads for parallel processes. Experimental evaluation shows that our parallel algorithm can achieve orders of magnitude speedups on GPUs and multi-core CPUs compared with the state-of-the-art sequential algorithm.
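As background for the local update scheme mentioned above, the following is a minimal sequential sketch of the classic forward-push approximation for PPR on a static graph. It is not the batched or parallel algorithm of the paper; the function name `forward_push`, the parameters `alpha` and `eps`, the adjacency-dict input, and the handling of nodes without out-edges are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def forward_push(graph, source, alpha=0.15, eps=1e-4):
    """Approximate PPR from `source` by local pushes (illustrative sketch).

    graph: dict mapping node -> list of out-neighbors (hypothetical format).
    Returns (estimate, residual); their total mass always sums to 1.
    """
    estimate = defaultdict(float)
    residual = defaultdict(float)
    residual[source] = 1.0
    queue = deque([source])
    queued = {source}

    while queue:
        u = queue.popleft()
        queued.discard(u)
        out = graph.get(u, [])
        r = residual[u]
        if not out:
            # Assumption: a node with no out-edges absorbs its whole residual.
            estimate[u] += r
            residual[u] = 0.0
            continue
        if r <= eps * len(out):
            continue  # residual too small to be worth pushing
        # Keep an alpha fraction locally, spread the rest over out-neighbors.
        estimate[u] += alpha * r
        residual[u] = 0.0
        share = (1.0 - alpha) * r / len(out)
        for v in out:
            residual[v] += share
            threshold = eps * max(len(graph.get(v, [])), 1)
            if v not in queued and residual[v] > threshold:
                queue.append(v)
                queued.add(v)
    return dict(estimate), dict(residual)
```

Each push touches only one node and its out-neighbors, which is what makes the scheme local: after an edge update, only the residuals near the changed edge need to be repaired and pushed again, rather than recomputing PPR over the whole graph.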
Feature extraction and classification are two important steps in strip steel surface defect recognition. Traditional defect feature extraction methods are not scale- or rotation-invariant. Moreover, traditional defect classification methods must trade off efficiency against accuracy. To solve these two problems, a novel recognition method is proposed in this paper. On the one hand, the novel defect feature extraction scheme is realized by building sampling benchmark scale (SBS) information for the training dataset and using the gradient magnitude and gradient orientation co-occurrence matrix (GMGOCM), the gray level and gradient orientation co-occurrence matrix (GLGOCM), and moment invariant features. On the other hand, K-nearest neighbor and R-nearest neighbor algorithms are used to prune the training dataset, and amplification factors of the pruned samples are used to improve the efficiency and accuracy of the least squares twin support vector machine (LSTWSVM) classifier. The experimental results show that the novel recognition method achieves both defect feature extraction with scale and rotation invariance and defect classification with high efficiency and accuracy.
Fault diagnosis for a blast furnace is a multi-class classification problem because the furnace can exhibit many kinds of abnormal states. Moreover, those abnormal states should be monitored and diagnosed in a timely manner, which helps workers take effective measures. The support vector machine (SVM) is currently state-of-the-art for many classification problems, but in practice many classification tasks involve imbalanced training examples. Imbalanced dataset learning is an important practical issue in machine learning, especially for SVMs, and fault diagnosis for a blast furnace is such an imbalanced data problem. A novel algorithm named optional support vector machine is proposed to solve this imbalanced data classification problem by pruning training sets, adding unlabelled data, and applying the edited nearest neighbor (ENN) rule. Firstly, the training sets of the majority class are pruned to reduce the training time. Secondly, the algorithm selects some useful unlabelled data and adds them to the training sets. Those samples compensate for the lack of training samples so that the training sets are representative. However, they may contain some noisy examples. Finally, the edited nearest neighbor rule is used to remove the noisy examples. The algorithm adds the unlabelled (testing) samples to balance the number of samples between the minority class and the majority class. Real-time production data from a blast furnace are used in the experiments. To diagnose more accurately which kind of fault has occurred, a binary tree multi-class classification method is adopted based on blast furnace characteristics. Simulation results show that the proposed algorithm is feasible and effective.
KEY WORDS: support vector machine (SVM); pruning training set; active learning; imbalanced data classification; edited nearest neighbor (ENN).
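The noise-removal step can be illustrated with a minimal sketch of the edited nearest neighbor rule in its common form: a sample is kept only if its label agrees with the majority label of its k nearest neighbors. The function name `edited_nearest_neighbor`, the choice of Euclidean distance, and the default `k=3` are assumptions for illustration, not details taken from the paper.

```python
import math
from collections import Counter


def edited_nearest_neighbor(points, labels, k=3):
    """Return indices of samples that survive ENN editing (illustrative sketch).

    points: list of equal-length numeric tuples; labels: list of class labels.
    Every sample is judged against the original, unedited set.
    """
    keep = []
    for i, p in enumerate(points):
        # Indices of the k closest other samples by Euclidean distance.
        others = (j for j in range(len(points)) if j != i)
        neighbors = sorted(others, key=lambda j: math.dist(p, points[j]))[:k]
        votes = Counter(labels[j] for j in neighbors)
        majority_label, _ = votes.most_common(1)[0]
        # Keep the sample only if its neighborhood agrees with its label.
        if majority_label == labels[i]:
            keep.append(i)
    return keep
```

An odd `k` avoids tied votes in the two-class case. This brute-force version compares every pair of samples, so for large training sets a spatial index would likely be preferable.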