The traditional model of the grey nearness degree of incidence has inherent limitations when computing over data sequences. It does not account for the effect of individual data points on the degree of incidence when adjacent values in the same sequence differ by orders of magnitude, and it can produce large errors for certain oscillating sequences. To address these problems, we propose an improved method that builds on the characteristics of the grey nearness degree of incidence and introduces a neural network algorithm to define a grey neural network nearness degree of incidence. From this definition, a nearness-degree-of-incidence model based on a grey neural network is established. We then apply the new model to data mining: the degrees of incidence serve as the variables of the distance metric function of a clustering algorithm, which is then used for data analysis. Finally, simulation experiments verify the effectiveness of the clustering algorithm under the new distance metric definition. The experimental results show that, compared with other methods, the outcomes computed by the improved model are more consistent with the actual situation. A clustering algorithm that uses the model delivers highly accurate results, so the new model can be applied in a wide range of fields.

INDEX TERMS Grey neural network, nearness degree of incidence, data mining, clustering algorithm.
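To make the oscillation problem mentioned in the abstract concrete, the sketch below computes a classical grey nearness degree between two equal-length sequences. It assumes the commonly cited form 1 / (1 + |S_i - S_j|), where S_i - S_j is the integral of the difference between the two sequences, approximated here by the trapezoid rule with unit spacing. The function name `nearness_degree` is hypothetical, and this is the traditional measure as recalled from the grey-systems literature, not the improved grey neural network model the abstract proposes.

```python
def nearness_degree(x, y):
    """Classical grey nearness degree (assumed form): 1 / (1 + |S_x - S_y|).

    S_x - S_y is approximated by the trapezoid rule over the pointwise
    difference x(k) - y(k), with unit spacing between observations.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have equal length >= 2")
    diff = [a - b for a, b in zip(x, y)]
    # Signed area between the two sequences (trapezoid rule).
    area = sum((diff[k] + diff[k + 1]) / 2.0 for k in range(len(diff) - 1))
    return 1.0 / (1.0 + abs(area))


flat = [1, 1, 1, 1, 1]
wavy = [1, 2, 1, 0, 1]      # oscillates around the flat sequence
shifted = [0, 0, 0, 0, 0]   # constant offset of 1 from the flat sequence

print(nearness_degree(flat, wavy))     # signed differences cancel out
print(nearness_degree(flat, shifted))  # offset accumulates over the sequence
```

Because the positive and negative differences of the oscillating sequence cancel in the signed area, the measure treats it as identical to the flat sequence, which is one plausible reading of the "large errors for oscillating sequences" that the abstract describes.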
The traditional K-means algorithm has been widely used in cluster analysis. However, the algorithm uses distance as its only constraint, which makes it sensitive to special data points. To address this problem, ambiguity is introduced as a new constraint condition in the K-means clustering process. On this basis, a new membership equation is proposed, and a method for solving the initial cluster center points is given, so as to reduce the risks caused by random selection of initial points. In addition, an optimized clustering algorithm with Gaussian distribution is derived by using fuzzy entropy as the cost function constraint. Compared with the traditional clustering method, the membership degree of the new equation reflects the relationship between a given point and a set more clearly, and it addresses the tendency of the traditional K-means algorithm to become trapped in local optima and to be easily influenced by noise. Experimental verification shows that the new method needs fewer iterations and achieves better clustering accuracy than the other methods, and thus a better clustering effect.

INDEX TERMS K-means; fuzzy entropy; cluster center; membership degree; fuzzy clustering

I. INTRODUCTION

The clustering process is the most effective classification method for people to summarize complex external information [1]. Although classification is now a mature field, clustering algorithms still face the challenge of how to eventually realize cognition, learning and classification under unsupervised conditions by extracting data features [2]. No single model can be applied universally with good results, since clustering does not rely on a priori knowledge [3]. Data carry enormous scientific and commercial value [4], especially given the explosive growth of data production in recent years. In 2016, the global data volume reached 10 ZB and maintained an annual growth rate of more than 40%. Scattered raw data, processed with data mining technology, can deliver valuable results, such as the planning of humanities and the construction of biological sciences reported in [6]-[8]. This type of research is of great significance for social development as well as for human self-cognition and learning cognition. Clustering research on various types of data has therefore attracted academic attention for a long time [9].
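The abstract above mentions a membership degree shaped by a fuzzy entropy term in the cost function, but it does not give the equation. As a rough illustration of the general idea only, the sketch below implements the standard entropy-regularized (maximum-entropy) fuzzy c-means update, in which memberships take a Gaussian-like form exp(-d²/λ) normalized over clusters. This is a swapped-in textbook technique, not the paper's membership equation or its initialization method; the function names and the parameter `lam` are hypothetical.

```python
import math


def entropy_memberships(points, centers, lam=1.0):
    """Soft memberships u[i][k] proportional to exp(-||x_i - c_k||^2 / lam)."""
    memberships = []
    for p in points:
        d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        shift = min(d2)  # subtract the minimum for numerical stability
        w = [math.exp(-(d - shift) / lam) for d in d2]
        total = sum(w)
        memberships.append([wi / total for wi in w])
    return memberships


def update_centers(points, memberships, centers):
    """Each center becomes the membership-weighted mean of all points."""
    dim = len(points[0])
    new_centers = []
    for k, old in enumerate(centers):
        weight = sum(u[k] for u in memberships)
        if weight == 0.0:  # cluster received no mass: keep the old center
            new_centers.append(list(old))
            continue
        new_centers.append([
            sum(u[k] * p[d] for u, p in zip(memberships, points)) / weight
            for d in range(dim)
        ])
    return new_centers


def entropy_fuzzy_cmeans(points, centers, lam=1.0, n_iter=50):
    """Alternate membership and center updates for a fixed number of steps."""
    for _ in range(n_iter):
        u = entropy_memberships(points, centers, lam)
        centers = update_centers(points, u, centers)
    return centers, entropy_memberships(points, centers, lam)


# Toy data: two well-separated 1-D groups, with assumed initial centers.
data = [[0.0], [0.2], [5.0], [5.2]]
final_centers, final_u = entropy_fuzzy_cmeans(data, [[0.0], [5.0]], lam=1.0)
print(final_centers)
```

A smaller `lam` pushes the memberships toward the hard assignments of ordinary K-means, while a larger `lam` spreads each point's membership across clusters, which softens the influence of isolated noisy points on any single center.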