Proceedings of the 24th International Conference on Machine Learning 2007
DOI: 10.1145/1273496.1273592
Self-taught learning

Cited by 1,122 publications (100 citation statements)
References 21 publications
“…More specifically, a dictionary could be trained using both the labelled and unlabelled data. Related to this is self-taught learning [12], also known as transfer learning from unlabelled data: one could train a large dictionary on a corpus that has come from a different distribution than the target dataset intended for classification.…”
Section: Discussion (mentioning)
confidence: 99%
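The self-taught setup described in this statement can be sketched in a few lines: learn a dictionary from unlabelled data only, encode the small labelled set as sparse codes over that dictionary, and classify in code space. This is a minimal illustration with synthetic stand-in data, not the authors' pipeline; the sizes and the choice of `DictionaryLearning`/`LogisticRegression` are assumptions.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Unlabelled corpus, possibly drawn from a broader distribution (synthetic stand-in).
X_unlabelled = rng.normal(size=(200, 16))

# Small labelled target dataset from a (potentially) different distribution.
X_labelled = rng.normal(loc=0.5, size=(40, 16))
y_labelled = (X_labelled[:, 0] > 0.5).astype(int)

# 1. Learn a dictionary (basis) from the unlabelled data only.
dico = DictionaryLearning(n_components=8, transform_algorithm="lasso_lars",
                          max_iter=20, random_state=0)
dico.fit(X_unlabelled)

# 2. Encode the labelled data as sparse codes over the learned dictionary.
codes = dico.transform(X_labelled)   # shape (40, 8)

# 3. Train an ordinary classifier on the transferred representation.
clf = LogisticRegression().fit(codes, y_labelled)
```

The key point, as in the quoted discussion, is that step 1 never sees a label and need not come from the target distribution; only the cheap, final classifier does.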
“…For target domains with few examples, the feature representation is critical. Ideally, the target-domain dataset can inherit the representational structure transferred from a model trained on the source-domain dataset [48,49].…”
Section: Cross-domain Transfer (mentioning)
confidence: 99%
“…This approach is proposed to be complemented as follows. The weights of the j-th hidden layer obtained after autoencoder pretraining can be refined by training a network with one hidden layer on the original or a similar problem, using the transfer-learning concept [7][8][9]. That is, the weights W obtained in the second step of the j-th iteration of the algorithm are used to initialise a network with one hidden layer and an output layer of size m, and the network is then trained on the corresponding labelled training set.…”
Section: Algorithm Description (mentioning)
confidence: 99%