Integrating Geometrical Context for Semantic Labeling of Indoor Scenes using RGBD Images

Khan, Salman; Bennamoun, Mohammed; Sohel, Ferdous; Togneri, Roberto; Naseem, Imran

doi:10.1007/s11263-015-0843-8

Cited by 22 publications

(16 citation statements)

References 49 publications


“…IMLS systems, in addition to the point clouds, provide a continuous trajectory of device locations instead of few discrete station points in TLS. Current methods for indoor reconstruction and semantic labelling use mainly TLSs (Becker et al, 2015; Mura et al, 2014a; Oesau et al, 2014) or RGB-Depth data (Armeni et al, 2016; Khan et al, 2015). If MLS data is used as in (Xiao and Furukawa, 2014), the benefit of trajectory data is not exploited.…”

Section: Introduction (mentioning)

confidence: 99%

Exploiting Indoor Mobile Laser Scanner Trajectories for Semantic Interpretation of Point Clouds

Nikoohemat, Peter, Elberink, et al. 2017

ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.


ABSTRACT: The use of Indoor Mobile Laser Scanners (IMLS) for data collection in indoor environments has been increasing in recent years. These systems, unlike Terrestrial Laser Scanners (TLS), collect data along a trajectory instead of at discrete scanner positions. In this research, we propose several methods to exploit the trajectories of IMLS systems for the interpretation of point clouds. By means of occlusion reasoning and use of trajectory as a set of scanner positions, we are capable of detecting openings in cluttered indoor environments. In order to provide information about both the partitioning of the space and the navigable space, we use the voxel concept for point clouds. Furthermore, to reconstruct walls, floor and ceiling we exploit the indoor topology and plane primitives. The results show that the trajectory is a valuable source of data for feature detection and understanding of indoor MLS point clouds.


“…Note that the 12-class accuracy of our network is calculated through the model previously trained for 37 classes. Our model substantially outperforms the one from [9] on large planar regions such as those labeled as floors and ceilings. This also results from the incorporated convolutional features and the fused global contexts.…”

Section: Results and Comparisons (mentioning)

confidence: 90%

“…Specifically, kernel descriptions based on traditional multi-channel features, such as color, depth gradient, and surface normal, are used as photometric and depth features [24]. A rich feature set containing various traditional features, e.g., SIFT, HOG, LBP and plane orientation, are used as local appearance features and plane appearance features in [9]. HOG features of RGB images and HOG+HH (histogram of height) features of depth images are extracted as representations in [25] for training successive classifiers.…”

Section: Related Work (mentioning)

confidence: 99%

“…Previous work on scene labeling can be divided into two categories according to their target scenes: indoor and outdoor scenes. Compared with outdoor scene labeling [6,7,8], indoor scene labeling is more challenging due to a larger set of semantic labels, more severe object occlusions, and more diverse object appearances [9]. For example, indoor object classes, such as beds covered with different sheets and various appearances of curtains, are much harder to characterize than outdoor classes, e.g., roads, buildings, and sky, through photometric channels only.…”

Section: Introduction (mentioning)

confidence: 99%

“…1, according to the global scene layout. To overcome this issue, graphical models, such as a conditional random field [9,11] or a mean-field approximation [15], have been applied to improve prediction results in a post-processing step. These methods, however, separate context modeling from convolutional feature learning, which may give rise to suboptimal results on complex scenes due to less discriminative feature representation [16].…”

Section: Introduction (mentioning)

confidence: 99%


LSTM-CF: Unifying Context Modeling and Fusion with LSTMs for RGB-D Scene Labeling

Gan, Liang, et al. 2016

Lecture Notes in Computer Science



Semantic labeling of RGB-D scenes is crucial to many intelligent applications including perceptual robotics. It generates pixelwise and fine-grained label maps from simultaneously sensed photometric (RGB) and depth channels. This paper addresses this problem by i) developing a novel Long Short-Term Memorized Context Fusion (LSTM-CF) Model that captures and fuses contextual information from multiple channels of photometric and depth data, and ii) incorporating this model into deep convolutional neural networks (CNNs) for end-to-end training. Specifically, contexts in photometric and depth channels are, respectively, captured by stacking several convolutional layers and a long short-term memory layer; the memory layer encodes both short-range and long-range spatial dependencies in an image along the vertical direction. Another long short-term memorized fusion layer is set up to integrate the contexts along the vertical direction from different channels, and perform bi-directional propagation of the fused vertical contexts along the horizontal direction to obtain true 2D global contexts. Finally, the fused contextual representation is concatenated with the convolutional features extracted from the photometric channels in order to improve the accuracy of fine-scale semantic labeling. Our proposed model has set a new state of the art, i.e., 48.1% and 49.4% average class accuracy over 37 categories (2.2% and 5.4% improvement) on the large-scale SUNRGBD dataset and the NYUDv2 dataset, respectively.


Multimodal Neural Networks: RGB-D for Semantic Segmentation and Object Detection

Schneider, Jasch, Fröhlich, et al. 2017

Lecture Notes in Computer Science

