This paper gives a review of the recent developments in deep learning and unsupervised feature learning for time-series problems. While these techniques have shown promise for modeling static data, such as computer vision, applying them to time-series data is gaining increasing attention. This paper overviews the particular challenges present in time-series data and provides a review of the works that have either applied time-series data to unsupervised feature learning algorithms or alternatively have contributed to modications of feature learning algorithms to take into account the challenges present in time-series data.
Most attempts at training computers for the difficult and time-consuming task of sleep stage classification involve a feature extraction step. Due to the complexity of multimodal sleep data, the size of the feature space can grow to the extent that it is also necessary to include a feature selection step. In this paper, we propose the use of an unsupervised feature learning architecture called deep belief nets (DBNs) and show how to apply it to sleep data in order to eliminate the use of handmade features. Using a postprocessing step of hidden Markov model (HMM) to accurately capture sleep stage switching, we compare our results to a feature-based approach. A study of anomaly detection with the application to home environment data collection is also presented. The results using raw data with a deep architecture, such as the DBN, were comparable to a feature-based approach when validated on clinical datasets.
Abstract:The availability of high-resolution remote sensing (HRRS) data has opened up the possibility for new interesting applications, such as per-pixel classification of individual objects in greater detail. This paper shows how a convolutional neural network (CNN) can be applied to multispectral orthoimagery and a digital surface model (DSM) of a small city for a full, fast and accurate per-pixel classification. The predicted low-level pixel classes are then used to improve the high-level segmentation. Various design choices of the CNN architecture are evaluated and analyzed. The investigated land area is fully manually labeled into five categories (vegetation, ground, roads, buildings and water), and the classification accuracy is compared to other per-pixel classification works on other land areas that have a similar choice of categories. The results of the full classification and segmentation on selected segments of the map show that CNNs are a viable tool for solving both the segmentation and object recognition task for remote sensing data.
Computed tomography (CT) is the method of choice for diagnosing ureteral stones - kidney stones that obstruct the ureter. The purpose of this study is to develop a computer aided detection (CAD) algorithm for identifying a ureteral stone in thin slice CT volumes. The challenge in CAD for urinary stones lies in the similarity in shape and intensity of stones with non-stone structures and how to efficiently deal with large high-resolution CT volumes. We address these challenges by using a Convolutional Neural Network (CNN) that works directly on the high resolution CT volumes. The method is evaluated on a large data base of 465 clinically acquired high-resolution CT volumes of the urinary tract with labeling of ureteral stones performed by a radiologist. The best model using 2.5D input data and anatomical information achieved a sensitivity of 100% and an average of 2.68 false-positives per patient on a test set of 88 scans.
This paper investigates a rapid and accurate detection system for spoilage in meat. We use unsupervised feature learning techniques (stacked restricted Boltzmann machines and auto-encoders) that consider only the transient response from undoped zinc oxide, manganese-doped zinc oxide, and fluorine-doped zinc oxide in order to classify three categories: the type of thin film that is used, the type of gas, and the approximate ppm-level of the gas. These models mainly offer the advantage that features are learned from data instead of being hand-designed. We compare our results to a feature-based approach using samples with various ppm level of ethanol and trimethylamine (TMA) that are good markers for meat spoilage. The result is that deep networks give a better and faster classification than the feature-based approach, and we thus conclude that the fine-tuning of our deep models are more efficient for this kind of multi-label classification task.
In this paper, we examine how to ground symbols referring to objects in perceptual data from a robot system by examining object entities and their changes over time. In particular, we approach the challenge by (1) tracking and maintaining object entities over time; and (2) utilizing an artificial neural network to learn the coupling between words referring to actions and movement patterns of tracked object entities. For this purpose, we propose a framework that relies on the notations presented in perceptual anchoring. We further present a practical extension of the notation such that our framework can track and maintain the history of detected object entities. Our approach is evaluated using everyday objects typically found in a home environment. Our object classification module has the possibility to detect and classify over several 100 object categories. We demonstrate how the framework creates and maintains, both in space and time, representations of objects such as "spoon" and "coffee mug." These representations are later used for training of different sequential learning algorithms to learn movement actions such as "pour" and "stir." We finally exemplify how learned movements actions, combined with commonsense knowledge, further can be used to improve the anchoring process per se.
There is an increasing interest in the machine learning community to automatically learn feature representations directly from the (unlabeled) data instead of using hand-designed features. The autoencoder is one method that can be used for this purpose. However, for data sets with a high degree of noise, a large amount of the representational capacity in the autoencoder is used to minimize the reconstruction error for these noisy inputs. This paper proposes a method that improves the feature learning process by focusing on the task relevant information in the data. This selective attention is achieved by weighting the reconstruction error and reducing the influence of noisy inputs during the learning process. The proposed model is trained on a number of publicly available image data sets and the test error rate is compared to a standard sparse autoencoder and other methods, such as the denoising autoencoder and contractive autoencoder.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.