Extreme learning machine (ELM) is an emerging learning algorithm for generalized single-hidden-layer feedforward neural networks, in which the hidden node parameters are randomly generated and the output weights are analytically computed. However, due to its shallow architecture, feature learning using ELM may not be effective for natural signals (e.g., images/videos), even with a large number of hidden nodes. To address this issue, this paper proposes a new ELM-based hierarchical learning framework for multilayer perceptrons. The proposed architecture has two main components, self-taught feature extraction followed by supervised feature classification, which are bridged by randomly initialized hidden weights. The novelties of this paper are as follows: 1) unsupervised multilayer encoding is conducted for feature extraction, and an ELM-based sparse autoencoder is developed via an ℓ1 constraint, which yields more compact and meaningful feature representations than the original ELM; 2) by exploiting the advantages of ELM random feature mapping, the hierarchically encoded outputs are randomly projected before final decision making, which leads to better generalization with faster learning speed; and 3) unlike the greedy layerwise training of deep learning (DL), the hidden layers of the proposed framework are trained in a forward manner: once the previous layer is established, the weights of the current layer are fixed without fine-tuning, which makes learning much more efficient than in DL. Extensive experiments on various widely used classification data sets show that the proposed algorithm achieves better and faster convergence than the existing state-of-the-art hierarchical learning methods. Furthermore, multiple applications in computer vision confirm the generality and capability of the proposed learning scheme.
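The basic single-hidden-layer ELM described in the first sentence of the abstract above (random hidden node parameters, analytically computed output weights) can be sketched roughly as follows. This is a minimal illustration, not the hierarchical framework the paper proposes: the function names, the tanh activation, the ridge term, and the toy data are all assumptions made for the example.

```python
import numpy as np

def train_elm(X, T, n_hidden=100, ridge=1e-3, seed=0):
    """Fit a basic ELM (hypothetical helper, not the paper's H-ELM)."""
    rng = np.random.default_rng(seed)
    # Hidden node parameters are drawn at random and never tuned.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # hidden-layer output matrix
    # Output weights in closed form: ridge-regularized least squares
    # (the ridge term is an assumption added for numerical stability).
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy two-class problem (assumed data): label is 1 when x0 * x1 > 0.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(400, 2))
labels = (X[:, 0] * X[:, 1] > 0).astype(int)
T = np.eye(2)[labels]  # one-hot targets

W, b, beta = train_elm(X, T)
pred = predict_elm(X, W, b, beta).argmax(axis=1)
train_accuracy = (pred == labels).mean()
print(f"training accuracy: {train_accuracy:.3f}")
```

Training involves no iterative weight updates: the only fitting step is a single linear solve for the output weights.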
Spatiotemporal image fusion is considered a promising way to provide Earth observations with both high spatial resolution and frequent coverage, and learning-based solutions have recently received broad attention. However, these algorithms, which treat spatiotemporal fusion as a single-image super-resolution problem, generally suffer from significant spatial information loss in coarse images due to the large upscaling factors in real applications. To address this issue, in this paper, we exploit temporal information in fine image sequences and solve the spatiotemporal fusion problem with a two-stream convolutional neural network called StfNet. The novelty of this paper is twofold. First, considering the temporal dependence among image sequences, we incorporate the fine image acquired at the neighboring date to super-resolve the coarse image at the prediction date. In this way, our network predicts a fine image not only from the structural similarity between coarse and fine image pairs but also by exploiting abundant texture information in the available neighboring fine images. Second, instead of estimating each output fine image independently, we consider the temporal relations among time-series images and formulate a temporal constraint. This temporal constraint aims to guarantee the uniqueness of the fusion result and encourages temporally consistent predictions in learning, and thus leads to more realistic final results. We evaluate the performance of StfNet using two actual data sets of Landsat-Moderate Resolution Imaging Spectroradiometer (MODIS) acquisitions, and both visual and quantitative evaluations demonstrate that our algorithm achieves state-of-the-art performance.
Abstract: This paper presents the results of a recent large-scale subjective study of image retargeting quality on a collection of images generated by several representative image retargeting methods. Owing to the many approaches to image retargeting that have been developed, there is a need for a diverse, independent, and freely available public database of retargeted images and their corresponding subjective scores. We build an image retargeting quality database, in which 171 retargeted images (obtained from 57 natural source images of different contents) were created by several representative image retargeting methods. The perceptual quality of each image was subjectively rated by at least 30 viewers, and the mean opinion scores (MOS) were obtained. The results reveal that the viewers arrived at a reasonable agreement on the perceptual quality of the retargeted images. Therefore, the MOS values obtained can be regarded as the ground truth for evaluating the performance of quality metrics. The database is made publicly available (Image Retargeting Subjective Database, [Online]. Available: http://ivp.ee.cuhk.edu.hk/projects/demo/retargeting/index.html) to the research community in order to further research on the perceptual quality assessment of retargeted images. Moreover, the database is analyzed from the perspectives of the retargeting scale, the retargeting method, and the source image content. We discuss how to retarget images according to the scale requirement and the source image attribute information. Furthermore, several publicly available quality metrics for retargeted images are evaluated on the database. How to develop an effective quality metric for retargeted images is discussed through a specifically designed subjective testing process. It is demonstrated that the metric performance can be further improved by fusing the descriptors of shape distortion and content information loss.
Index Terms: Image quality assessment, image retargeting, objective metric, subjective evaluation.
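As a rough illustration of how MOS values can serve as ground truth for an objective metric, the sketch below averages per-image viewer ratings into MOS and compares them with metric scores by Spearman rank correlation. The abstract does not say which agreement measure was used, so the choice of rank correlation is an assumption, and the ratings, metric scores, and function names are all hypothetical (the study used at least 30 viewers per image; three are shown here for brevity).

```python
from statistics import mean

def mean_opinion_scores(ratings):
    """Average the viewer ratings of each image into one MOS per image."""
    return [mean(r) for r in ratings]

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: four images, three viewer ratings each.
ratings = [[4, 5, 4], [2, 3, 2], [1, 1, 2], [3, 4, 3]]
metric_scores = [0.9, 0.4, 0.1, 0.7]  # made-up objective metric outputs

mos = mean_opinion_scores(ratings)
srocc = spearman(mos, metric_scores)
print(mos, srocc)
```

A correlation close to 1 would indicate that the metric orders the images the same way the viewers did.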
Extreme learning machine (ELM), as a new learning framework, has attracted increasing attention in the areas of large-scale computing, high-speed signal processing, artificial intelligence, and so on. ELM aims to break the barriers between conventional artificial learning techniques and biological learning mechanisms, and represents a suite of machine learning techniques in which hidden neurons need not be tuned. ELM theories and algorithms argue that "random hidden neurons" capture the essence of some brain learning mechanisms as well as the intuitive sense that the efficiency of brain learning need not rely on the computing power of neurons. Thus, compared with traditional neural networks and support vector machines, ELM offers significant advantages such as fast learning speed, ease of implementation, and minimal human intervention. Due to its remarkable generalization performance and implementation efficiency, ELM has been used in a wide range of applications. In this paper, we first provide an overview of newly derived ELM theories and approaches. Then, in light of the ongoing development of multilayer feature representation, we discuss some new trends in ELM-based hierarchical learning. Finally, we present several interesting ELM applications to showcase the practical advances on this subject.
Numerous state-of-the-art perceptual image quality assessment (IQA) algorithms share a common two-stage process: distortion description followed by distortion effects pooling. In the first stage, the distortion descriptors or measurements are expected to be effective representatives of human visual variations, while the second stage should express well the relationship between the quality descriptors and the perceptual visual quality. However, most of the existing quality descriptors (e.g., luminance, contrast, and gradient) do not seem to be consistent with human perception, and effects pooling is often done in ad hoc ways. In this paper, we propose a novel full-reference IQA metric. It applies non-negative matrix factorization (NMF) to measure image degradations by making use of the parts-based representation of NMF. In addition, a new machine learning technique, the extreme learning machine (ELM), is employed to address the limitations of the existing pooling techniques. Compared with neural networks and support vector regression, ELM can achieve higher learning accuracy with faster learning speed. Extensive experimental results demonstrate that the proposed metric has better performance and lower computational complexity in comparison with the relevant state-of-the-art approaches.
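The NMF step mentioned in the abstract above factorizes a non-negative matrix V into non-negative factors W and H, so that each column of V is an additive combination of "parts" (the columns of W). The sketch below uses the standard multiplicative update rules for the Frobenius-norm objective; the abstract does not say which NMF solver the paper uses or how the factors are turned into a quality descriptor, so the function name, rank, iteration count, and toy matrix are all assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=300, seed=0, eps=1e-9):
    """Approximate V (non-negative) as W @ H with W, H >= 0.

    Hypothetical helper using multiplicative updates; eps avoids
    division by zero and is an implementation assumption.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so the
        # factors stay non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy non-negative matrix of exact rank 3 (assumed data, standing in
# for a matrix whose columns would be vectorized image patches).
rng = np.random.default_rng(42)
V = rng.random((20, 3)) @ rng.random((3, 30))

W, H = nmf(V, rank=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Because both factors are constrained to be non-negative, parts can only be added, never subtracted, which is what gives NMF its parts-based character.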