2012
DOI: 10.1007/978-3-642-35289-8_34

Deep Learning via Semi-supervised Embedding

Abstract: We show how nonlinear embedding algorithms popular for use with shallow semisupervised learning techniques such as kernel methods can be applied to deep multilayer architectures, either as a regularizer at the output layer, or on each layer of the architecture. This provides a simple alternative to existing approaches to deep learning whilst yielding competitive error rates compared to those methods, and existing shallow semi-supervised techniques.
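
To make the abstract's idea concrete, below is a minimal sketch, assuming PyTorch, of a neighbor-embedding loss used as a regularizer on a hidden layer of a deep network. The margin-based pairwise loss follows the standard "pull neighbors together, push non-neighbors apart" formulation common to such embedding regularizers; names such as EmbedNet, embedding_loss, and lambda_emb are illustrative, not the authors' exact code, and neighbor pairs would typically come from a k-NN graph over unlabeled data.

import torch.nn as nn
import torch.nn.functional as F

class EmbedNet(nn.Module):
    # Simple two-layer classifier; the hidden activation is also used
    # as the representation on which the embedding regularizer acts.
    def __init__(self, d_in, d_hidden, n_classes):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.out = nn.Linear(d_hidden, n_classes)

    def forward(self, x):
        h = self.hidden(x)
        return self.out(h), h

def embedding_loss(h_i, h_j, is_neighbor, margin=1.0):
    # Pull neighboring pairs together; push non-neighbors at least
    # `margin` apart (hinge on the squared distance).
    d = ((h_i - h_j) ** 2).sum(dim=1)
    pull = is_neighbor * d
    push = (1.0 - is_neighbor) * F.relu(margin - d)
    return (pull + push).mean()

def total_loss(model, x_lab, y_lab, x_i, x_j, is_neighbor, lambda_emb=0.1):
    # Supervised term on labeled data plus the semi-supervised
    # embedding term on (possibly unlabeled) neighbor pairs.
    logits, _ = model(x_lab)
    sup = F.cross_entropy(logits, y_lab)
    _, h_i = model(x_i)
    _, h_j = model(x_j)
    emb = embedding_loss(h_i, h_j, is_neighbor)
    return sup + lambda_emb * emb

In training, labeled examples feed the supervised term while neighbor pairs feed the embedding term within the same minibatch; the paper also considers placing the regularizer at the output layer or on every layer rather than on a single hidden layer.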


Cited by 545 publications (469 citation statements)
References 18 publications
“…sequence data, with complex dependencies between data nodes. Weston et al. [34] propose a TDNN [35] method for NLP tasks involving sequence data. In [36], the authors apply a CNN to object recognition in video.…”
Section: Related Work
confidence: 99%
“…Although most recent breakthroughs have been achieved with applications of supervised learning, the potential added value of unsupervised learning is so important that it is worthwhile exploring a large array of approaches. The main appeal of unsupervised learning is that it is a crucial ingredient in semi-supervised learning (Weston, Ratle, & Collobert, 2008): there are many more unlabeled data sources than labeled ones, and the volume of the unlabeled data can be considerably larger. Similarly, better unsupervised representation learning has already shown its advantage as a regularizer (Bengio, Lamblin, Popovici, & Larochelle, 2007; Erhan et al., 2010; Hinton, Osindero, & Teh, 2006; Le et al., 2012; Lee, Ekanadham, & Ng, 2008; Lee, Grosse, Ranganath, & Ng, 2009a; Raina, Battle, Lee, Packer, & Ng, 2007; Ranzato, Poultney, Chopra, & LeCun, 2007), in the context of transfer learning, e.g., winning two transfer learning competitions in 2011 (Mesnil et al., 2011), and in domain adaptation (Glorot, Bordes, & Bengio, 2011b).…”
Section: Unsupervised Learning
confidence: 99%
“…The learning of p(X) can be performed either independently of [20,21] or jointly with [22,23] the learning of p(y|X). These approaches have been shown to be efficient on numerous problems such as natural language processing [24], speech recognition [25], or handwriting recognition [26].…”
Section: Global Image Labeling Approaches
confidence: 99%