We study semi-supervised learning when the data consists of multiple intersecting manifolds. We give a finite sample analysis to quantify the potential gain of using unlabeled data in this multi-manifold setting. We then propose a semi-supervised learning algorithm that separates different manifolds into decision sets, and performs supervised learning within each set. Our algorithm involves a novel application of Hellinger distance and size-constrained spectral clustering. Experiments demonstrate the benefit of our multi-manifold semi-supervised learning approach.
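The abstract names Hellinger distance but does not say how the algorithm applies it. As a reminder of the quantity itself, the following is a minimal sketch of the standard Hellinger distance between two discrete probability distributions. The function name `hellinger` and the list-based representation are illustrative assumptions, not the authors' implementation.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    p and q are equal-length sequences of non-negative probabilities.
    The result lies in [0, 1]: 0 for identical distributions, 1 for
    distributions with disjoint support.
    (Hypothetical helper for illustration only.)
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    # Sum of squared differences between the square roots of the probabilities
    s = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(s / 2.0)
```

How such a distance would be turned into graph edge weights for size-constrained spectral clustering is not specified in the abstract, so that step is left out here.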
An embodied approach to reading comprehension suggests that emerging readers must learn to map words and phrases onto their remembered experiences, but this is made difficult by the necessity of focusing attention on decoding. Having children manipulate toys to correspond to what they are reading overcomes this problem, but introduces its own problem for the classroom, namely having to provide a classroom full of children with manipulatives. In this article, we demonstrate that having first- and second-grade children manipulate images of toys on a computer screen benefits their comprehension as much as physical manipulation of the toys. In addition, manipulation on one day facilitates reading in the same domain one week later. These findings encourage the use of manipulation of text-relevant images as an educational technology for enhancing early reading comprehension. The findings also set constraints on theoretical accounts of embodiment while reading.
We consider a novel "online semi-supervised learning" setting where (mostly unlabeled) data arrives sequentially in large volume, and it is impractical to store it all before learning. We propose an online manifold regularization algorithm. It differs from standard online learning in that it learns even when the input point is unlabeled. Our algorithm is based on convex programming in kernel space with stochastic gradient descent, and inherits the theoretical guarantees of standard online algorithms. However, naïve implementation of our algorithm does not scale well. This paper focuses on efficient, practical approximations; we discuss two sparse approximations using buffering and online random projection trees. Experiments show our algorithm achieves risk and generalization accuracy comparable to standard batch manifold regularization, while each step runs quickly. Our online semi-supervised learning setting is an interesting direction for further theoretical development, paving the way for semi-supervised learning to work on real-world lifelong learning tasks.
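To make concrete the idea of an online learner that updates even on unlabeled points, here is a heavily simplified sketch. It is not the paper's algorithm: it assumes a linear model instead of a kernel expansion, a hinge loss for labeled points, a Gaussian similarity weight, and a fixed-size first-in-first-out buffer of recent points. All names and hyperparameters (`online_step`, `lam_ridge`, `lam_manifold`, `sigma`) are hypothetical.

```python
import math
from collections import deque

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def rbf_weight(a, b, sigma=1.0):
    """Gaussian similarity between two points (assumed edge weight)."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))

def online_step(w, x, y, buffer, lr=0.1, lam_ridge=0.01, lam_manifold=0.1):
    """One stochastic gradient step on a linear model f(x) = w . x.

    y is +1 or -1 for a labeled point, or None for an unlabeled one.
    The instantaneous objective is assumed to be
        hinge(y, f(x))                     (only if labeled)
      + (lam_ridge / 2) * ||w||^2
      + (lam_manifold / 2) * sum_j k(x, x_j) * (f(x) - f(x_j))^2
    where x_j ranges over the buffered points.
    """
    fx = dot(w, x)
    grad = [lam_ridge * wi for wi in w]          # ridge term
    if y is not None and y * fx < 1.0:           # hinge loss is active
        grad = [g - y * xi for g, xi in zip(grad, x)]
    for xj in buffer:                            # smoothness term
        weight = rbf_weight(x, xj)
        diff = fx - dot(w, xj)
        grad = [g + lam_manifold * weight * diff * (xi - xji)
                for g, xi, xji in zip(grad, x, xj)]
    new_w = [wi - lr * g for wi, g in zip(w, grad)]
    buffer.append(x)                             # deque(maxlen=...) drops the oldest
    return new_w

# Usage: one labeled point followed by one unlabeled point.
buf = deque(maxlen=5)
w = online_step([0.0, 0.0], [1.0, 0.0], 1, buf)
w = online_step(w, [0.9, 0.1], None, buf)
```

The second call changes `w` even though no label is supplied, because the smoothness term pulls the prediction at the new point toward the predictions at similar buffered points. The fixed-size buffer stands in loosely for the buffering approximation the abstract mentions; the random projection tree variant is not sketched.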