One of the most important ecosystems in the Amazon rainforest is the Mauritia flexuosa swamp or “aguajal”. However, deforestation of its dominant species, the Mauritia flexuosa palm, also known as “aguaje”, is a common issue, and conservation is poorly monitored because of the difficult access to these swamps. The contribution of this paper is twofold: the presentation of a dataset called MauFlex, and the proposal of a segmentation and measurement method for areas covered in Mauritia flexuosa palms using high-resolution aerial images acquired by UAVs. The method performs a semantic segmentation of Mauritia flexuosa using an end-to-end trainable Convolutional Neural Network (CNN) based on the Deeplab v3+ architecture. Images were acquired under different environment and light conditions using three different RGB cameras. The MauFlex dataset was created from these images and it consists of 25,248 image patches of 512 × 512 pixels and their respective ground truth masks. The results over the test set achieved an accuracy of 98.143%, specificity of 96.599%, and sensitivity of 95.556%. It is shown that our method is able not only to detect full-grown isolated Mauritia flexuosa palms, but also young palms or palms partially covered by other types of vegetation.
Hyperspectral imaging systems are becoming widely used due to their increasing accessibility and their ability to provide detailed spectral responses based on hundreds of spectral bands. However, the resulting hyperspectral images (HSIs) come at the cost of increased storage requirements, increased computational time to process, and highly redundant data. Thus, dimensionality reduction techniques are necessary to decrease the number of spectral bands while retaining the most useful information. Our contribution is two-fold: First, we propose a filter-based method called interband redundancy analysis (IBRA) based on a collinearity analysis between a band and its neighbors. This analysis helps to remove redundant bands and dramatically reduces the search space. Second, we apply a wrapper-based approach called greedy spectral selection (GSS) to the results of IBRA to select bands based on their information entropy values and train a compact convolutional neural network to evaluate the performance of the current selection. We also propose a feature extraction framework that consists of two main steps: first, it reduces the total number of bands using IBRA; then, it can use any feature extraction method to obtain the desired number of feature channels. We present classification results obtained from our methods and compare them to other dimensionality reduction methods on three hyperspectral image datasets. Additionally, we used the original hyperspectral data cube to simulate the process of using actual filters in a multispectral imager.
Segmenting clouds in high-resolution satellite images is an arduous and challenging task due to the many types of geographies and clouds a satellite can capture. Therefore, it needs to be automated and optimized, specially for those who regularly process great amounts of satellite images, such as governmental institutions. In that sense, the contribution of this work is twofold: We present the CloudPeru2 dataset, consisting of 22,400 images of 512 × 512 pixels and their respective hand-drawn cloud masks, as well as the proposal of an end-to-end segmentation method for clouds using a Convolutional Neural Network (CNN) based on the Deeplab v3+ architecture. The results over the test set achieved an accuracy of 96.62%, precision of 96.46%, specificity of 98.53%, and sensitivity of 96.72% which is superior to the compared methods. 1
Video surveillance through security cameras has become difficult due to the fact that many systems require manual human inspection for identifying violent or suspicious scenarios, which is practically inefficient. Therefore, the contribution of this paper is twofold: the presentation of a video dataset called UNI-Crime, and the proposal of a violent robbery detection method in CCTV videos using a deep-learning sequence model. Each of the 30 frames of our videos passes through a pre-trained VGG-16 feature extractor; then, all the sequence of features is processed by two convolutional long-short term memory (con-vLSTM) layers; finally, the last hidden state passes through a series of fully-connected layers in order to obtain a single classification result. The method is able to detect a variety of violent robberies (i.e., armed robberies involving firearms or knives, or robberies showing different level of aggressiveness) with an accuracy of 96.69%.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.