2004
DOI: 10.1109/tpami.2004.127

Semisupervised learning of classifiers: theory, algorithms, and their application to human-computer interaction

Abstract: Automatic classification is one of the basic tasks required in any pattern recognition and human computer interaction application. In this paper we discuss training probabilistic classifiers with labeled and unlabeled data. We provide a new analysis that shows under what conditions unlabeled data can be used in learning to improve classification performance. We also show that if the conditions are violated, using unlabeled data can be detrimental to classification performance. We discuss the implications of th…

Cited by: 232 publications (145 citation statements)
References: 43 publications
“…are not able to match the underlying generative structure [10]. This phenomenon occurs in both evaluation measures.…”
Section: 22 (mentioning)
confidence: 94%
“…As stated in [10], if the correct structure of the real distribution of the data is obtained, unlabelled data improve the classifier, otherwise, unlabelled data can actually degrade performance. For this reason, it seems more appropriate to perform a structural search in order to find the real model.…”
Section: Learning Multi-dimensional Bayesian Network Classifiers In T… (mentioning)
confidence: 99%
“…Some studies concluded that significant improvements in classification performance can be achieved when unlabeled examples are used, while others have indicated otherwise [4] [10] [12] [25]. Blum and Mitchell [4] and Cozman et al. [10] suggested that unlabeled data can help to reduce variance of the estimator as long as the modeling assumptions match the ground truth data. Otherwise, unlabeled data may either improve or degrade the classification performance, depending on the complexity of the classifier compared to the training set size [12].…”
Section: Related Work (mentioning)
confidence: 99%