Abstract: In this paper, a novel Feature-Reduction Fuzzy C-Means (FRFCM) with Feature Linkage Weight (FRFCM-FLW) algorithm is introduced. By combining FRFCM with feature linkage weights, we develop a new feature selection model, called the Feature Linkage Weight Based FRFCM, using fuzzy clustering. The larger the number of features, the more complicated the problem and the more time is spent producing the output of the classifier or model. Feature selection has been established as a…
“…The total number of Samples (30) [14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29]} P2{[0, 2,4,6,8,10,12,14,16,18,20,22,24,26,28], [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29]} P3{[0, 1,2,3,4,5,6,7,8,…”
Section: All Predictions Correctly Number Of Samples (mentioning)
confidence: 99%
“…In order to make clustering widely available in more fields, it can be applied to large-scale group decision-making [8,9]. Existing clustering algorithms mainly include hard clustering [10,11] and fuzzy clustering [12][13][14]. The former has only two membership degrees, 0 and 1; that is, each data object is strictly assigned to a single cluster. The membership in the latter can take any value within the interval [0,1]; that is, a data object can belong to multiple clusters with different memberships.…”
Among fuzzy clustering algorithms, the possibilistic fuzzy clustering algorithm has been widely used in many fields. However, the traditional Euclidean distance cannot measure the similarity between samples well in high-dimensional data. Moreover, if clusters overlap or features are strongly correlated, clustering accuracy is easily affected. To overcome these problems, a collaborative possibilistic fuzzy clustering algorithm based on the information bottleneck is proposed in this paper. The algorithm retains the advantages of the original algorithm: on the one hand, it uses mutual information loss as the similarity measure instead of Euclidean distance, which reduces the subjective error caused by an arbitrary choice of similarity measure and improves clustering accuracy; on the other hand, it introduces the collaborative idea into information-bottleneck-based possibilistic fuzzy clustering, forming an accurate and complete representation of the data's organizational structure by making full use of the correlations between different feature subsets for collaborative clustering. To examine the clustering performance of this algorithm, five algorithms were selected for comparison experiments on several datasets. Experimental results show that the proposed algorithm outperforms the comparison algorithms in terms of clustering accuracy and collaborative validity.
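The fuzzy membership update at the heart of such algorithms can be sketched generically. Below is a minimal version of the standard fuzzy c-means membership formula, u_ik = 1 / Σ_j (d_ik / d_ij)^(2/(m−1)), written to accept any precomputed distance matrix; the paper's contribution of replacing Euclidean distance with a mutual-information loss would plug in at the point where D is computed. The function name and interface are illustrative, not the authors' code.

```python
import numpy as np

def fcm_memberships(D, m=2.0):
    """Fuzzy c-means membership update from a distance matrix D
    (n samples x c clusters): u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1)).
    The similarity measure (Euclidean, mutual-information loss, ...)
    is whatever produced D, so the update stays measure-agnostic."""
    D = np.asarray(D, dtype=float)
    D = np.maximum(D, 1e-12)                 # guard against zero distances
    # ratio[i, k, j] = (d_ik / d_ij) ** (2 / (m - 1))
    ratio = (D[:, :, None] / D[:, None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=2)           # rows sum to 1
```

With m = 2 and distances [1, 2] to two cluster prototypes, a sample gets memberships [0.8, 0.2]: closer prototypes receive proportionally higher membership, and equal distances yield equal membership.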
“…Another strategy is the nonparametric strategy, which includes multiple techniques; one example is Random Forest [17]. Many researchers have used variable importance measurement strategies to enhance classifier performance, such as [18], naive Bayes text classifiers [19,20], the fuzzy clustering method, and feature weighting for neural networks [21,22] and SVMs [23]. Feature weighting has also been used as a feature selection strategy to determine the influence of features on results and then exclude irrelevant and redundant features [24][25][26], as well as the information gain attribute [27].…”
In data analysis and machine learning, assigning an appropriate importance to each feature, known as feature weighting, plays a pivotal role, especially given the interplay between the symmetry of the data distribution and the need to weight individual features differently. Avoiding the dominance of large-scale features is also essential in data preparation, which makes choosing an effective normalization approach one of the more challenging aspects of machine learning. In addition to normalization, feature weighting is another strategy for handling the differing importance of features. One way to measure the dependency between features is the correlation coefficient, which indicates the strength of the relationship between them. The integration of normalization with feature weighting in data transformation for classification has not been extensively studied. The goal is to improve the accuracy of classification methods by balancing the normalization step with assigning greater importance to features strongly related to the class feature. To achieve this, we combine Min–Max normalization with feature weighting, increasing feature values based on their correlation coefficients with the class feature. This paper presents the proposed Correlation Coefficient with Min–Max Weighted (CCMMW) approach, in which the normalized data depend on their correlation with the class feature. Logistic regression, support vector machine, k-nearest neighbor, neural network, and naive Bayes classifiers were used to evaluate the proposed method, on twenty numerical datasets from the UCI Machine Learning Repository and Kaggle. The empirical results show that the proposed CCMMW significantly improves classification performance with the support vector machine, logistic regression, and neural network classifiers on most datasets.
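The core transformation described above can be sketched in a few lines: Min–Max scale each feature, then amplify it according to its correlation with the class. The function name and the specific (1 + |r|) weighting below are illustrative assumptions, not the paper's exact CCMMW formula.

```python
import numpy as np

def correlation_weighted_minmax(X, y):
    """Min-Max normalize each column of X to [0, 1], then scale it by
    (1 + |r|), where r is the Pearson correlation between that feature
    and the class vector y. Illustrative sketch of the CCMMW idea."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mins, maxs = X.min(axis=0), X.max(axis=0)
    span = np.where(maxs > mins, maxs - mins, 1.0)   # avoid /0 on constant columns
    X_norm = (X - mins) / span                        # Min-Max to [0, 1]
    # Pearson correlation of each feature with the class feature
    r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    r = np.nan_to_num(r)                              # constant feature -> r = 0
    return X_norm * (1.0 + np.abs(r))                 # emphasize class-correlated features
```

A feature perfectly correlated with the class ends up on a [0, 2] scale while an uncorrelated one stays on [0, 1], so distance-based classifiers such as k-NN and SVM give the informative feature roughly twice the influence.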
“…Although fuzzy clustering can effectively deal with high-dimensional feature data through feature reduction [7][8][9][10][11][12][13], it is still difficult to process large-scale data, especially streaming data. Previously, to enable large-scale data clustering [14][15][16][17][18], Hore et al. [19,20] proposed two incremental algorithms, named SPFCM (Single-Pass Fuzzy C-Means) and OFCM (Online Fuzzy C-Means), based on single-pass and online clustering strategies, respectively.…”
In the era of big data, more and more datasets fall outside the scope of traditional clustering algorithms because of their large scale and high dimensionality. To break through these limitations, incremental mechanisms and feature reduction have become two indispensable parts of current clustering algorithms. We propose two incremental fuzzy clustering algorithms based on feature reduction, built on single-pass and online incremental strategies, respectively. The first uses the Weighted Feature Reduction Fuzzy C-Means (WFRFCM) clustering algorithm to process each chunk in turn, folding the clustering result of the previous chunk into the computation for the next chunk. The second runs WFRFCM on every chunk at the same time and then combines and re-clusters the per-chunk results. To investigate the clustering performance of these two algorithms, six datasets were selected for comparative experiments. Experimental results showed that both algorithms select high-quality features through feature reduction and handle large-scale data through the incremental strategy; the combination of the two phases maintains clustering efficiency while keeping high clustering accuracy.
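The single-pass strategy can be illustrated with a simplified sketch: each chunk is clustered together with the previous chunk's centroids, which are carried forward as weighted points summarizing all earlier data. For brevity this uses hard weighted k-means rather than the paper's WFRFCM, and the function name and chunk interface are assumptions.

```python
import numpy as np

def single_pass_cluster(chunks, c, iters=20, seed=0):
    """SPFCM-style single-pass sketch: cluster each chunk together with
    the previous centroids, whose weights equal the amount of data they
    summarize (hard weighted k-means stands in for weighted FCM)."""
    rng = np.random.default_rng(seed)
    centers, weights = None, None
    for chunk in chunks:
        X = np.asarray(chunk, dtype=float)
        w = np.ones(len(X))
        if centers is None:
            centers = X[rng.choice(len(X), c, replace=False)]  # init from first chunk
        else:
            X = np.vstack([X, centers])            # fold in summary of past chunks
            w = np.concatenate([w, weights])
        for _ in range(iters):                     # weighted Lloyd iterations
            d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            lbl = d.argmin(axis=1)
            centers = np.array([
                np.average(X[lbl == k], axis=0, weights=w[lbl == k])
                if (lbl == k).any() else centers[k]
                for k in range(c)])
        weights = np.array([w[lbl == k].sum() for k in range(c)])
    return centers
```

Only c weighted centroids cross chunk boundaries, so memory stays constant no matter how many chunks stream in; the online variant would instead cluster chunks independently and merge their centroids in a final pass.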
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.