Marco Körner scite author profile

Multi-Temporal Land Cover Classification with Sequential Recurrent Encoders

Earth observation (EO) sensors deliver data at daily or weekly intervals. Most land use and land cover classification (LULC) approaches, however, are designed for cloud-free and mono-temporal observations. The increasing temporal capabilities of today's sensors enable the use of temporal, along with spectral and spatial, features. Domains such as speech recognition or neural machine translation work with inherently temporal data and, today, achieve impressive results by using sequential encoder-decoder structures. Inspired by these sequence-to-sequence models, we adapt an encoder structure with convolutional recurrent layers in order to approximate a phenological model for vegetation classes based on a temporal sequence of Sentinel 2 (S2) images. In our experiments, we visualize internal activations over a sequence of cloudy and non-cloudy images and find several recurrent cells that reduce the input activity for cloudy observations. Hence, we assume that our network has learned cloud-filtering schemes solely from input data, which could alleviate the need for tedious cloud-filtering as a preprocessing step for many EO approaches. Moreover, using unfiltered temporal series of top-of-atmosphere (TOA) reflectance data, our experiments achieved state-of-the-art classification accuracies on a large number of crop classes with minimal preprocessing, compared to other classification approaches.

Towards Multi-class Object Detection in Unconstrained Remote Sensing Imagery

Azimi

et al. 2019

[0000-0002-6084-2272], Eleonora Vig [0000-0002-7015-6874], Reza Bahmanyar [0000-0002-6999-714X], Marco Körner [0000-0002-9186-4175], and Peter Reinartz [0000-0002-8122-1475]

Abstract. Automatic multi-class object detection in remote sensing images in unconstrained scenarios is of high interest for several applications including traffic monitoring and disaster management. The huge variation in object scale, orientation, category, and complex backgrounds, as well as the different camera sensors pose great challenges for current algorithms. In this work, we propose a new method consisting of a novel joint image cascade and feature pyramid network with multi-size convolution kernels to extract multi-scale strong and weak semantic features. These features are fed into rotation-based region proposal and region of interest networks to produce object detections. Finally, rotational non-maximum suppression is applied to remove redundant detections. During training, we minimize joint horizontal and oriented bounding box loss functions, as well as a novel loss that enforces oriented boxes to be rectangular. Our method achieves 68.16% mAP on horizontal and 72.45% mAP on oriented bounding box detection tasks on the challenging DOTA dataset, outperforming all published methods by a large margin (+6% and +12% absolute improvement, respectively). Furthermore, it generalizes to two other datasets, NWPU VHR-10 and UCAS-AOD, and achieves competitive results with the baselines even when trained on DOTA. Our method can be deployed in multi-class object detection applications, regardless of the image and object scales and orientations, making it a great choice for unconstrained aerial and satellite imagery.

Building instance classification using street view images

Kang

Körner

Wang

et al. 2018

ISPRS Journal of Photogrammetry and Remote Sensing

239

120

This is the pre-print version; to read the final version, please go to ISPRS Journal of Photogrammetry and Remote Sensing, Elsevier (https://doi.org/10.1016/j.isprsjprs.2018.02.006). Land-use classification based on spaceborne or aerial remote sensing images has been extensively studied over the past decades. Such classification is usually a patch-wise or pixel-wise labeling over the whole image. But for many applications, such as urban population density mapping or urban utility planning, a classification map based on individual buildings is much more informative. However, such semantic classification still poses some fundamental challenges, for example, how to retrieve fine boundaries of individual buildings. In this paper, we proposed a general framework for classifying the functionality of individual buildings. The proposed method is based on Convolutional Neural Networks (CNNs) which classify façade structures from street view images, such as Google StreetView, in addition to remote sensing images which usually only show roof structures. Geographic information was utilized to mask out individual buildings, and to associate the corresponding street view images. We created a benchmark dataset which was used for training and evaluating CNNs. In addition, the method was applied to generate building classification maps on both region and city scales of several cities in Canada and the US.

Temporal Vegetation Modelling Using Long Short-Term Memory Networks for Crop Identification from Medium-Resolution Multi-spectral Satellite Images

2017

Self-attention for raw optical Satellite Time Series Classification

Rußwurm

Körner

2020

ISPRS Journal of Photogrammetry and Remote Sensing

179

Evaluation of CNN-Based Single-Image Depth Estimation Methods

Koch

Lukas

Fraundorfer

et al. 2019

While an increasing interest in deep models for single-image depth estimation (SIDE) can be observed, established schemes for their evaluation are still limited. We propose a set of novel quality criteria, allowing for a more detailed analysis by focusing on specific characteristics of depth maps. In particular, we address the preservation of edges and planar regions, depth consistency, and absolute distance accuracy. In order to employ these metrics to evaluate and compare state-of-the-art SIDE approaches, we provide a new high-quality RGB-D dataset. We used a digital single-lens reflex (DSLR) camera together with a laser scanner to acquire high-resolution images and highly accurate depth maps. Experimental results show the validity of our proposed evaluation protocol.

Building Footprint Extraction From VHR Remote Sensing Images Combined With Normalized DSMs Using Fused Fully Convolutional Networks

Bittner

Adam

Cui

et al. 2018

IEEE J. Sel. Top. Appl. Earth Observations Remote Sensing

112

Automatic building extraction and delineation from high-resolution satellite imagery is an important but very challenging task, due to the extremely large diversity of building appearances. Nowadays, it is possible to use multiple high-resolution remote sensing data sources which allow the integration of different information in order to improve the extraction accuracy of building outlines. Many algorithms are built on spectral-based or appearance-based criteria, from single or fused data sources, to perform the building footprint extraction. But the features for these algorithms are usually manually extracted, which limits their accuracy. Recently developed fully convolutional networks (FCNs), which resemble normal convolutional neural networks (CNNs) except that the last fully connected layer is replaced by another convolution layer with a large "receptive field", quickly became the state-of-the-art method for image recognition tasks, as they make it possible to perform dense pixel-wise classification of input images. Based on these advantages, i.e., the automatic extraction of relevant features, and dense classification of images, we propose an end-to-end fully convolutional network (FCN) which effectively combines the spectral and height information from different data sources and automatically generates a full resolution binary building mask. Our architecture (Fused-FCN4s) consists of three parallel networks merged at a late stage, which helps propagate fine detailed information from earlier layers to higher levels, in order to produce an output with more accurate building outlines. The inputs to the proposed Fused-FCN4s are three-band (RGB), panchromatic (PAN), and normalized digital surface model (nDSM) images. Experimental results demonstrate that the fusion of several networks is able to achieve excellent results on complex data. Moreover, the developed model was successfully applied to different cities to show its generalization capacity.

Single-Image Super Resolution for Multispectral Remote Sensing Data Using Convolutional Neural Networks

Liebel

Körner

2016

Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci.

ABSTRACT: In optical remote sensing, spatial resolution of images is crucial for numerous applications. Space-borne systems are most likely to be affected by a lack of spatial resolution, due to their natural disadvantage of a large distance between the sensor and the sensed object. Thus, methods for single-image super resolution are desirable to exceed the limits of the sensor. Apart from assisting visual inspection of datasets, post-processing operations (e.g., segmentation or feature extraction) can benefit from detailed and distinguishable structures. In this paper, we show that recently introduced state-of-the-art approaches for single-image super resolution of conventional photographs, making use of deep learning techniques, such as convolutional neural networks (CNN), can successfully be applied to remote sensing data. With a huge amount of training data available, end-to-end learning is reasonably easy to apply and can achieve results unattainable using conventional handcrafted algorithms. We trained our CNN on a specifically designed, domain-specific dataset, in order to take into account the special characteristics of multispectral remote sensing data. This dataset consists of publicly available SENTINEL-2 images featuring 13 spectral bands, a ground resolution of up to 10 m, and a high radiometric resolution, thus satisfying our requirements in terms of quality and quantity. In experiments, we obtained results superior to those of competing approaches trained on generic image sets, which failed to reasonably scale satellite images with a high radiometric resolution, as well as to those of conventional interpolation methods.


Marco Körner

Multi-Temporal Land Cover Classification with Sequential Recurrent Encoders
