“…These methods may be particularly interesting when dealing with large data sets that are difficult to interpret, where manual interpretation and labeling would be costly. Semi-supervised learning refers to the use of both labeled and unlabeled data within the learning process [18,35].…”
Section: Fuzzy Clustering: Original Fuzzy C-means and Semi-supervised
In order to predict regime duration in a given chaotic system for which a set of output prototypes is available, we propose to use a clustering technique to define classes of regime duration, which are then used by a chosen classifier. In this way, the exact boundaries between classes are allowed to emerge from the data, as long as prototypical values fall in distinct classes. We investigate both unsupervised and semi-supervised fuzzy clustering techniques (FCM and ssFCM), as well as the traditional k-Means technique. To classify the data, we use the neuro-fuzzy system ANFIS and two decision trees (J48 and NBTree). We apply the procedure to the well-known Lorenz strange attractor, using bred vector counts as input variables.
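To make the clustering step concrete, here is a minimal pure-Python sketch of fuzzy C-means on one-dimensional data (such as regime durations). The function name, initialization scheme, fixed iteration count, and toy data are illustrative assumptions, not the paper's actual setup; the membership and center updates follow the standard FCM formulas.

```python
def fcm(points, c, m=2.0, iters=50):
    """Fuzzy C-means sketch on 1-D data.

    points : list of floats (e.g. regime durations)
    c      : number of clusters
    m      : fuzzifier (> 1); m = 2 is the common default
    Returns (centers, memberships)."""
    # spread the initial centers evenly over the data range (illustrative choice)
    lo, hi = min(points), max(points)
    centers = [lo + (hi - lo) * (k + 0.5) / c for k in range(c)]
    u = [[0.0] * c for _ in points]
    for _ in range(iters):
        # membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        for i, x in enumerate(points):
            d = [abs(x - ck) or 1e-12 for ck in centers]  # avoid division by zero
            for k in range(c):
                u[i][k] = 1.0 / sum((d[k] / dj) ** (2.0 / (m - 1)) for dj in d)
        # center update: membership-weighted mean of the points
        for k in range(c):
            den = sum(u[i][k] ** m for i in range(len(points)))
            centers[k] = sum((u[i][k] ** m) * x
                             for i, x in enumerate(points)) / den
    return centers, u
```

With two well-separated groups of durations, the two centers settle near the group means, and class boundaries emerge from the crossover of the membership functions rather than from hand-picked thresholds.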
“…Semi-supervised learning combines the advantages of supervised and unsupervised learning; it has become a research hotspot and has been applied in many areas [5][6]. To address the drawbacks of the FCM algorithm, and considering the practical situation of an intrusion detection system, fuzzy clustering is combined with supervisory information: clusters are initialized with labeled known data, and the clustering process is then improved by the constraint of a small amount of known information together with a large amount of unlabeled data; this is the semi-supervised fuzzy clustering algorithm [7][8].…”
Section: Intrusion Detection Algorithm Based On Semi-Supervised
In order to overcome the weakness that intrusion detection systems are sensitive to outliers, we propose an intrusion detection algorithm based on semi-supervised fuzzy clustering. In this algorithm, the training data for semi-supervised learning is a hybrid of labeled and unlabeled samples. While training the system model, we use a few labeled samples and many unlabeled samples as seeds to initialize the system's classifier. Under the constraint of the labeled data, we use the fuzzy C-Means method to generate clusters without requiring many labeled samples and without easily falling into local optima. Compared with the FCM algorithm, experimental results on the KDD CUP 99 data set show the effectiveness of the proposed algorithm: it achieves a higher detection rate and a lower false detection rate.
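The seeding idea described above can be sketched as a small variant of FCM in which a few labeled samples anchor the clusters: centers are initialized from the labeled seeds, and the seeds' memberships stay clamped to their known class during the iterations. This is a simplified stand-in for ssFCM, not the paper's exact algorithm; all names and the toy data are assumptions.

```python
def ssfcm(points, labels, c, m=2.0, iters=50):
    """Semi-supervised FCM sketch on 1-D data.

    points : list of floats
    labels : dict {point index: cluster id} for the few labeled seeds
             (assumes every cluster has at least one labeled sample)
    Returns (centers, memberships)."""
    # initialize each center from its labeled seeds
    centers = []
    for k in range(c):
        seeds = [points[i] for i, y in labels.items() if y == k]
        centers.append(sum(seeds) / len(seeds))
    u = [[0.0] * c for _ in points]
    for _ in range(iters):
        for i, x in enumerate(points):
            if i in labels:
                # labeled memberships are clamped to the known class
                for k in range(c):
                    u[i][k] = 1.0 if labels[i] == k else 0.0
                continue
            d = [abs(x - ck) or 1e-12 for ck in centers]
            for k in range(c):
                u[i][k] = 1.0 / sum((d[k] / dj) ** (2.0 / (m - 1)) for dj in d)
        for k in range(c):
            den = sum(u[i][k] ** m for i in range(len(points)))
            centers[k] = sum((u[i][k] ** m) * x
                             for i, x in enumerate(points)) / den
    return centers, u
```

Because the seeds pin cluster identities from the start, the result is less sensitive to initialization than plain FCM, which is the motivation the snippet gives for the semi-supervised variant.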
“…In cases where the required amount of labelled samples cannot be provided, the learning system commonly fails. On the other hand, in unsupervised learning, the result strongly depends on prior assumptions and on the appropriate choice of, e.g., distance measure, distribution function, and expected number of classes/clusters [3]. The disadvantages of supervised and unsupervised learning have led researchers to semi-supervised learning, which lies halfway between the supervised and unsupervised approaches.…”
In multiword expressions (MWEs), multiple words unite to build a new unit in language. When MWE identification is treated as a binary classification task, one of the most important factors in performance is training the classifier with a sufficient number of labelled samples. Since manual labelling is a time-consuming task, the performance of MWE recognition studies is limited by the size of the training sets. In this study, we propose comparison-based and common-decision co-training approaches in order to enlarge the MWE dataset. In the experiments, the performances of the proposed approaches were compared to those of standard co-training [1] and manual labelling, where statistical and linguistic features are employed as two different views of the MWE dataset [2]. A number of tests with different settings were performed on a Turkish MWE dataset. Ten different classifiers were utilized in the experiments, and the best performing classifier pair was observed to be the SMO-SMO pair. The experimental results showed that the common-decision co-training approach is an alternative to hand-labelling of large MWE datasets, and both newly proposed approaches outperform standard co-training [2] when the training set is to be enlarged in MWE classification.
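The common-decision idea can be sketched generically: two classifiers, each trained on one view, and an unlabeled sample is added to the training set only when both views agree and both are confident. The sketch below uses a toy nearest-centroid classifier in place of SMO; the class names, the margin-based confidence, the threshold, and the data are all illustrative assumptions, not the paper's configuration.

```python
import statistics

class CentroidClassifier:
    """Toy 1-D nearest-centroid classifier standing in for SMO."""
    def fit(self, xs, ys):
        self.cent = {y: statistics.mean(x for x, yy in zip(xs, ys) if yy == y)
                     for y in set(ys)}
    def predict(self, x):
        return min(self.cent, key=lambda y: abs(x - self.cent[y]))
    def confidence(self, x):
        # margin between the two nearest centroids (illustrative confidence)
        d = sorted(abs(x - c) for c in self.cent.values())
        return d[1] - d[0]

def common_decision_cotrain(view1, view2, ys, unl1, unl2,
                            rounds=3, thresh=1.0):
    """Common-decision co-training sketch over two views.

    view1/view2 : labeled features per view;  ys : their labels
    unl1/unl2   : unlabeled features per view (aligned)."""
    L1, L2, Y = list(view1), list(view2), list(ys)
    pool = list(zip(unl1, unl2))
    for _ in range(rounds):
        c1, c2 = CentroidClassifier(), CentroidClassifier()
        c1.fit(L1, Y)
        c2.fit(L2, Y)
        still_unlabeled = []
        for x1, x2 in pool:
            p1, p2 = c1.predict(x1), c2.predict(x2)
            # label a sample only on a confident common decision
            if p1 == p2 and min(c1.confidence(x1), c2.confidence(x2)) >= thresh:
                L1.append(x1); L2.append(x2); Y.append(p1)
            else:
                still_unlabeled.append((x1, x2))
        pool = still_unlabeled
    return L1, L2, Y
```

In contrast, standard co-training lets each view label samples for the other independently; requiring a joint confident decision is what makes the common-decision variant more conservative about the labels it adds.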