Abstract. Negative sequential patterns (NSPs) are sequences that contain both non-occurring and occurring items, and they can play an irreplaceable role in understanding and addressing many business applications. However, some problems arise after mining NSPs, the most urgent of which is how to select actionable positive or negative sequential patterns. This difficulty stems from two factors: 1) positive sequential patterns (PSPs) mined before NSPs are considered may mislead decisions; and 2) selecting actionable patterns is much harder after mining NSPs, because NSPs far outnumber PSPs. In this paper, an improved method of pruning uninteresting itemsets, adapted to the selection of actionable sequential patterns (ASPs), is proposed. Then, a novel and efficient method, called SAP, is proposed to select actionable positive and negative sequential patterns. Experimental results indicate that SAP is very efficient in selecting ASPs. To the best of our knowledge, SAP is the best method for the selection of actionable positive and negative sequential patterns.
A negative sequential pattern (NSP), which contains both non-occurring and occurring items, is sometimes much more informative than a positive sequential pattern (PSP) in many applications and intelligent systems. To date, very few methods are available to mine NSPs because of the intrinsic complexity of the problem. Furthermore, there is no unified definition of negative containment, i.e., how a data sequence contains a negative sequence. Researchers who are new to negative sequential patterns are often confused by these differing definitions and are eager to know how the existing methods differ. In this paper, we therefore select four typical existing methods, PNSP, Neg-GSP, e-NSP and NSPM, implement their algorithms in Java, and compare their definitions, methods, runtimes and numbers of NSPs. Examples and experiments on four datasets clearly show their differences.
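To make the notion of negative containment concrete, the following minimal Python sketch checks whether a data sequence contains a negative sequence under one toy definition. The definition is an assumption chosen purely for illustration: each element is a single item, and a negated item must be absent from the gap between the positive items that surround it. It is not the definition used by PNSP, Neg-GSP, e-NSP or NSPM, and the name `contains_negative` is hypothetical.

```python
from typing import List, Tuple

# A pattern element is (item, is_negative); ("b", True) means "b must not occur here".
Element = Tuple[str, bool]


def contains_negative(data: List[str], pattern: List[Element]) -> bool:
    """Toy negative-containment check (illustrative assumption, not a published definition).

    Positive items must appear in order; every negated item must be absent
    from the gap in which it is placed by the pattern.
    """

    def match(p_idx: int, start: int) -> bool:
        # Gather the negated items that precede the next positive element.
        forbidden = []
        while p_idx < len(pattern) and pattern[p_idx][1]:
            forbidden.append(pattern[p_idx][0])
            p_idx += 1
        if p_idx == len(pattern):
            # Trailing negatives: they must not occur in the rest of the data.
            return all(x not in forbidden for x in data[start:])
        item = pattern[p_idx][0]
        for pos in range(start, len(data)):
            gap_is_clean = all(x not in forbidden for x in data[start:pos])
            if data[pos] == item and gap_is_clean and match(p_idx + 1, pos + 1):
                return True
        return False

    return match(0, 0)


# The negative sequence <a, not-b, c>.
pattern = [("a", False), ("b", True), ("c", False)]
result_plain = contains_negative(["a", "c"], pattern)
result_blocked = contains_negative(["a", "b", "c"], pattern)
result_second_a = contains_negative(["a", "b", "a", "c"], pattern)
```

Under this toy definition the last sequence is accepted, because the second `a` can be matched with no `b` before `c`. A stricter definition could reject it, which is the kind of disagreement a unified definition would have to settle.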
With the expansion of data scale and the increase in data complexity, it is particularly important to identify clusters accurately and to store clustering results efficiently. To address this, we propose a novel clustering algorithm, Shape clustering based on data field (STATE), which can quickly identify clusters of arbitrary shape and greatly reduce the storage space of clustering results for any dataset without reducing accuracy. Instead of finding cluster centers, STATE uses the data field to find the edges of clusters and the directions of those edges. The results of STATE are presented as the edges of clusters, without the data objects inside the clusters and without noise. Extensive experiments show that STATE can recognize complex data distributions in noisy environments without discrimination and greatly reduce the storage space of clustering results. When applied to a real-world task, facial feature extraction, STATE can recognize eyes, nose, mouth, eyebrows and facial contours automatically, without calibrating key features or training. Using the extracted facial features, we achieve facial recognition with high accuracy.
Clustering is an unsupervised learning method widely used for identifying the inherent structure of data, and it is applied in various fields such as data mining, pattern recognition, and machine learning. A new topological clustering method called δ-open set clustering is proposed in this study. The key idea of this method is to determine δ-open sets in the data, where each δ-open set represents one specific category of data. It is shown that this method performs robustly even on complex data sets. It can classify complex data sets with diverse shapes, recognize noise, and deal with high-dimensional data sets. The method is effective even when the distribution of the data is unbalanced. The clustering process requires a single input parameter, namely the value of δ. A face identification experiment on the Olivetti Face Database indicates that this method performs much more reliably than the peak clustering method. We also provide an improved δ-open set clustering that is capable of handling clusters with extreme density differences. This article is categorized under: Technologies > Structure Discovery and Clustering; Algorithmic Development > Structure Discovery
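The abstract does not spell out how δ-open sets are constructed, so the sketch below is not the authors' algorithm. It is a simple stand-in, written under the assumption that points closer than δ belong together, that shows how a single distance parameter can partition a data set: clusters are the connected components of the graph linking points at distance less than δ. The function name `delta_components` is hypothetical.

```python
import math
from typing import List, Sequence


def delta_components(points: List[Sequence[float]], delta: float) -> List[int]:
    """Label points by connected components of the 'distance < delta' graph.

    Illustrative stand-in only; not the delta-open set clustering method itself.
    """
    n = len(points)
    labels = [-1] * n  # -1 means "not yet assigned"
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        # Grow a new component from this unassigned seed point.
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and math.dist(points[i], points[j]) < delta:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.4, 5.0)]
labels = delta_components(points, delta=0.6)
```

With this stand-in, a chain of nearby points ends up in one component even when its endpoints are farther apart than δ, which is how a single threshold can follow clusters of diverse shapes. The quadratic scan over all pairs is kept for brevity and would need a spatial index on larger data.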