Classification of very high resolution (VHR) remote sensing imagery is a rapidly emerging discipline, but it faces several challenges: the huge scale of the pixel data involved, the poor discriminability of the features traditionally used to represent different regions, and the lack of available ground truth data. This paper presents a framework that overcomes these hurdles through a novel semi-supervised learning approach built on multiscale superpixel tessellation representations of VHR imagery. Superpixels are homogeneous, irregularly shaped regions; they form the backbone of our approach and are used to derive new features by learning a decision tree. Our semi-supervised learning approach operates on a superpixel graph and combines the large-margin capability of a support vector machine (SVM) with graph-based Laplacian label propagation in a single objective function. We further provide a self-contained and easily parallelizable linear iterative optimization approach based on the principle of majorization-minimization. We evaluate the approach on four different geographic settings with varying neighborhood types and compare it with the widely used Gaussian Multiple Instance Learning algorithm. Our results show advantages in both accuracy and efficiency, which, coupled with the ease of model building and the inherently parallelizable optimization, make our framework well suited for deployment in large-scale applications such as global human settlement mapping, population distribution, and change detection.
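As a rough illustration of the graph-based Laplacian label propagation ingredient mentioned above, the following minimal sketch spreads a few known labels over a small superpixel adjacency graph by repeatedly averaging neighbor scores. It is not the objective function or the majorization-minimization optimizer of this paper: the SVM margin term is omitted entirely, and the function name `propagate_labels`, the five-node chain graph, the unit edge weights, and the seed labels are all hypothetical choices made for the example.

```python
def propagate_labels(weights, labels, iterations=200):
    """Harmonic label propagation on a weighted graph (illustrative only).

    weights: dict mapping node -> {neighbor: edge weight}
    labels:  dict mapping each labeled node -> fixed score (+1 or -1 here)
    Returns a dict holding a score for every node.
    """
    # Unlabeled nodes start at 0; labeled nodes start at their known score.
    scores = {node: labels.get(node, 0.0) for node in weights}
    for _ in range(iterations):
        updated = dict(scores)
        for node, nbrs in weights.items():
            if node in labels:
                continue  # labeled nodes stay clamped to their known score
            total = sum(nbrs.values())
            # Each unlabeled node takes the weighted mean of its neighbors,
            # which drives the scores toward a minimizer of f^T L f
            # (L being the graph Laplacian) under the label constraints.
            updated[node] = sum(w * scores[n] for n, w in nbrs.items()) / total
        scores = updated
    return scores


# Hypothetical chain of five superpixels: 0 - 1 - 2 - 3 - 4, unit weights.
chain = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {2: 1.0, 4: 1.0},
    4: {3: 1.0},
}
seeds = {0: 1.0, 4: -1.0}  # assumed ground truth for the two end superpixels
scores = propagate_labels(chain, seeds)
predicted = {node: (1 if s > 0 else -1 if s < 0 else 0) for node, s in scores.items()}
```

Each node update depends only on the previous scores of its neighbors, so all nodes can be updated independently within an iteration, which is the kind of structure that makes iterative schemes on superpixel graphs easy to parallelize.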