Supervised Enhanced Soft Subspace Clustering (SESSC) for TSK Fuzzy Classifiers
Yuqi Cui,
Huidong Wang,
Dongrui Wu
Abstract: Fuzzy c-means based clustering algorithms are frequently used to estimate the antecedent parameters of Takagi-Sugeno-Kang (TSK) fuzzy classifiers, with one rule initialized from each cluster. However, most of these clustering algorithms are unsupervised, wasting the valuable label information in the training data. This paper proposes a supervised enhanced soft subspace clustering (SESSC) algorithm, which simultaneously considers the within-cluster compactness, between-cluster separation, and label information in clust…
“…As indicated by (7) and (8), increasing the scale of σ also increases the value of Z r to avoid saturation. Similar tricks have already been used for training TSK models with fuzzy clustering algorithms, such as FCM [14], ESSC [16] and SESSC [6]. The parameter σ is computed by:…”
Section: Enhance the Performance of TSK Fuzzy Systems on High-Dimensi… (mentioning)
confidence: 99%
“…Fuzzy clustering [4]- [6] and evolutionary algorithms [7], [8] have been used to determine the parameters of TSK fuzzy systems on small datasets. However, their computational cost may be too high for big data.…”
Section: Introduction (mentioning)
confidence: 99%
“…Traditional optimization algorithms for TSK fuzzy systems use grid partition to partition the input space into different fuzzy regions, whose number grows exponentially with the input dimensionality. A more popular and flexible way is clustering-based partition, e.g., fuzzy c-means (FCM) [14], EWFCM [15], ESSC [4], [16] and SESSC [6], in which the fuzzy sets in different rules are independent and optimized separately.…”
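The quoted passage contrasts grid partition, whose rule count grows exponentially with the input dimensionality, with clustering-based partition, whose rule count equals the number of clusters. A minimal sketch of that combinatorial difference (the function names and parameter values are illustrative, not taken from the cited papers):

```python
def grid_rules(n_mfs_per_dim: int, dim: int) -> int:
    # Grid partition: one rule per combination of fuzzy sets,
    # i.e., k^D rules for k MFs per dimension and D inputs.
    return n_mfs_per_dim ** dim

def cluster_rules(n_clusters: int) -> int:
    # Clustering-based partition (FCM, ESSC, SESSC, ...): one rule
    # per cluster, independent of the input dimensionality.
    return n_clusters

# Even a modest 10-dimensional input with 3 MFs per dimension
# already yields 3^10 = 59049 grid rules, versus, say, 20 clusters.
print(grid_rules(3, 10))   # 59049
print(cluster_rules(20))   # 20
```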
The Takagi-Sugeno-Kang (TSK) fuzzy system with Gaussian membership functions (MFs) is one of the most widely used fuzzy systems in machine learning. However, it usually has difficulty handling high-dimensional datasets. This paper explores why TSK fuzzy systems with Gaussian MFs may fail on high-dimensional inputs. After transforming defuzzification into an equivalent softmax function, we find that the poor performance is due to the saturation of the softmax. We show that two defuzzification operations, LogTSK and HTSK, the latter of which is first proposed in this paper, can avoid the saturation. Experimental results on datasets with various dimensionalities validated our analysis and demonstrated the effectiveness of LogTSK and HTSK.
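The saturation effect the abstract describes can be sketched numerically. The log firing level of a rule with Gaussian MFs is a sum of per-dimension squared distances, so its scale grows with the input dimensionality, and the softmax over rules collapses onto one rule. The HTSK-style variant below is an assumption based on the abstract's description (averaging the per-dimension terms rather than summing them); it is a sketch, not the paper's exact formulation:

```python
import numpy as np

def tsk_logits(x, centers, sigma):
    # Log firing levels of Gaussian-MF rules: the log of the product of
    # per-dimension MFs is a *sum* of negative squared distances, so its
    # scale grows linearly with the input dimensionality.
    d2 = ((x - centers) ** 2) / (2 * sigma ** 2)  # shape (n_rules, dim)
    return -d2.sum(axis=1)

def htsk_logits(x, centers, sigma):
    # HTSK-style logits (assumption: average the per-dimension terms
    # instead of summing), keeping the scale O(1) in the dimension.
    d2 = ((x - centers) ** 2) / (2 * sigma ** 2)
    return -d2.mean(axis=1)

def softmax(z):
    # Normalized firing levels are exactly a softmax over the logits.
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
dim, n_rules = 500, 5
x = rng.normal(size=dim)
centers = rng.normal(size=(n_rules, dim))

p_tsk = softmax(tsk_logits(x, centers, 1.0))
p_htsk = softmax(htsk_logits(x, centers, 1.0))
print(p_tsk.max())   # typically near 1: the softmax has saturated
print(p_htsk.max())  # well below 1: all rules stay active
```

With 500 inputs, the summed logits differ by amounts on the order of the dimensionality, so one rule dominates the vanilla normalized firing levels, while the averaged logits keep every rule contributing.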