2011
DOI: 10.1109/tkde.2010.158

When Does Cotraining Work in Real Data?

Abstract: Co-training, a paradigm of semi-supervised learning, promises to effectively alleviate the shortage of labeled examples in supervised learning. The standard two-view co-training requires the dataset to be described by two views of features, and previous studies have shown that co-training works well if the two views satisfy the sufficiency and independence assumptions. In practice, however, these two assumptions are often not known or ensured (even when the two views are given). More commonly, most…
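The abstract refers to the standard two-view co-training loop. As a point of reference, below is a minimal sketch of that loop in Python, assuming scikit-learn's GaussianNB as the base learner; the function name, the confidence-based example selection, and the add_per_round parameter are illustrative choices, not the paper's implementation.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def cotrain(X1, X2, y, labeled_idx, unlabeled_idx, rounds=10, add_per_round=5):
    """Standard two-view co-training sketch: in each round, each view's
    classifier labels the unlabeled examples it is most confident about,
    and those examples are added to the shared labeled pool."""
    labeled, unlabeled = list(labeled_idx), list(unlabeled_idx)
    y = y.copy()
    clf1, clf2 = GaussianNB(), GaussianNB()
    for _ in range(rounds):
        if not unlabeled:
            break
        clf1.fit(X1[labeled], y[labeled])
        clf2.fit(X2[labeled], y[labeled])
        for clf, X in ((clf1, X1), (clf2, X2)):
            if not unlabeled:
                break
            proba = clf.predict_proba(X[unlabeled])
            # positions (within the unlabeled pool) of the most confident predictions
            picks = np.argsort(proba.max(axis=1))[::-1][:add_per_round]
            for p in sorted(picks, reverse=True):  # reverse so deletions keep positions valid
                idx = unlabeled[p]
                y[idx] = clf.predict(X[idx:idx + 1])[0]
                labeled.append(idx)
                del unlabeled[p]
    clf1.fit(X1[labeled], y[labeled])
    clf2.fit(X2[labeled], y[labeled])
    return clf1, clf2
```

Whether this procedure helps depends on the quality of the two views X1 and X2, which is exactly the question the paper examines.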


Cited by 76 publications (40 citation statements)
References 13 publications
“…Nevertheless, this method has become an example for recent models thanks to the idea of using the agreement (or disagreement) of multiple classifiers and the mutual teaching approach. A good study of when co-training works can be found in [32].…”
Section: B Self-labeled Techniques: Previous Work
confidence: 99%
“…Some promising results have been achieved in this field [3][4][5][6][7], but this proved to be a difficult task, as the relation between the characteristics of the views and the performance of cotraining has not been sufficiently understood. Moreover, research [4] indicates that given a small training dataset as in real-world situations where co-training is called for, the sufficiency and independence assumptions cannot be reliably verified, making the split methods unreliable and application of co-training uncertain.…”
Section: Related Work
confidence: 99%
“…In addition, we performed experiments on 14 binary and 8 multi-class UCI datasets also previously used for evaluating co-training [4,8,13]. The benchmark datasets of various properties were selected to give us a better insight into how effective our method is on datasets of varying dimensionality, size and redundancy.…”
Section: A Datasets and Configuration
confidence: 99%
“…It is evident, however, that a random split would not work in most cases. Du et al. [8] tried several heuristics for view split and found that all heuristics failed with insufficient labeled data. The necessary condition of co-training given in [24] suggested that among all potential view splits, the one which enables the most unlabeled instances to connect with labeled examples in the combinative graph is preferred; this was empirically verified in [24] and might give inspiration to develop sound practical view split approaches.…”
Section: About the Views
confidence: 99%
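The view-split discussion above can be made concrete with the naive baseline it argues against: randomly partitioning the feature columns into two "views". The sketch below assumes NumPy and is illustrative only; the cited results indicate such splits, and the heuristics tried by Du et al., are unreliable when labeled data are scarce.

```python
import numpy as np

def random_view_split(X, rng=None):
    """Randomly partition the feature columns of X into two disjoint 'views'
    for co-training (the naive baseline discussed above)."""
    rng = np.random.default_rng(rng)
    perm = rng.permutation(X.shape[1])
    half = X.shape[1] // 2
    return X[:, perm[:half]], X[:, perm[half:]]
```

For example, X1, X2 = random_view_split(X, rng=0) would produce two candidate views to feed the co-training loop sketched earlier, without any guarantee that the sufficiency and independence assumptions hold.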