2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2016.394

DAG-Recurrent Neural Networks for Scene Labeling

Abstract: Figure 1: With the local representations extracted from Convolutional Neural Networks (CNNs), the 'sand' pixels (in the first image) are likely to be misclassified as 'road', and the 'building' pixels (in the second image) are easily confused with 'streetlight'. Our DAG-RNN significantly boosts the discriminative power of local representations by modeling their contextual dependencies. As a result, it produces smoother and more semantically meaningful labeling maps. The figure is best viewed i…

Cited by 122 publications (140 citation statements). References 34 publications.
“…23. http://www.semantic3d.net/

Method | Base network | Contribution
… [43] | GoogLeNet (FCN) | Patchwise CNN, standalone CRF
CRFasRNN [70] | FCN-8s | CRF reformulated as RNN
Dilation [71] | VGG-16 | Dilated convolutions
ENet [72] | ENet bottleneck | Bottleneck module for efficiency
Multi-scale-CNN-Raj [73] | VGG-16 (FCN) | Multi-scale architecture
Multi-scale-CNN-Eigen [74] | Custom | Multi-scale sequential refinement
Multi-scale-CNN-Roy [75] | Multi-scale-CNN-Eigen | Multi-scale coarse-to-fine refinement
Multi-scale-CNN-Bian [76] | FCN | Independently trained multi-scale FCNs
ParseNet [77] | VGG-16 | Global context feature fusion
ReSeg [78] | VGG-16 + ReNet | Extension of ReNet to semantic segmentation
LSTM-CF [79] | Fast R-CNN + DeepMask | Fusion of contextual information from multiple sources
2D-LSTM [80] | MDRNN | Image context modelling
rCNN [81] | MDRNN | Different input sizes, image context
DAG-RNN [82] | Elman network | Graph image structure for context modelling
SDS [10] | R-CNN + Box CNN | Simultaneous detection and segmentation
DeepMask [83] | VGG-A | Proposal generation for segmentation
SharpMask [84] | DeepMask | Top-down refinement module
MultiPathNet [85] | Fast R-CNN + DeepMask | Multi-path information flow through network
Huang-3DCNN [86] | Own 3DCNN | 3DCNN for voxelized point clouds
PointNet [87] | Own MLP-based | Segmentation of unordered point sets
Clockwork Convnet [88] | FCN | Clockwork scheduling for sequences
3DCNN-Zhang…”
Section: Methods (mentioning confidence: 99%)
“…For example, existing RNN models mainly focus on either sequence-structured inputs, such as Long Short-Term Memory (LSTM) [17] and GRU, or tree-structured inputs, such as Tree-LSTM [18]. A handful of RNN models target static DAGs in different application domains; e.g., DAG-RNN [19], [20] models each 2D image as a DAG for scene labeling, while RNN-LE [21] models each contact map over a protein's amino acids as a DAG for protein structure prediction. However, both DAG-RNN and RNN-LE are based on the plain RNN architecture, and are unable to capture the peculiarities of a diffusion process.…”
Section: Introduction (mentioning confidence: 99%)
“…h(t) is the hidden state of the t-th subsequence, and its information is transferred to future subsequences through the matrix U_hh. Here we do not employ an additional softmax layer to normalize the output o into a probability vector, as done in other RNN architectures [9], [45]. This is because the softmax layer is unsuitable for our soft-regression model: the elements in the output of the softmax operator sum to 1, while the elements in α(t)y_i sum to α(t), whose value lies within [0, 1].…”
Section: A Soft RNN (SRNN) Regression Based Early Action Prediction (mentioning confidence: 99%)
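The arithmetic behind the softmax objection can be checked directly. A toy sketch follows, assuming a one-hot ground-truth vector y_i and a hypothetical confidence weight alpha_t; the concrete values are illustrative and not from the cited paper:

```python
# Sketch: softmax outputs always sum to 1, so they cannot match a
# soft-regression target alpha(t)*y_i whose elements sum to alpha(t).
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

y_i = np.array([0.0, 1.0, 0.0])  # one-hot ground-truth class vector
alpha_t = 0.4                    # assumed confidence alpha(t) in [0, 1]
target = alpha_t * y_i           # soft target: elements sum to 0.4

o = np.array([0.3, 2.0, -1.0])   # raw RNN output for the t-th subsequence
print(softmax(o).sum())          # 1.0 for any o
print(target.sum())              # 0.4
```

Since no input to softmax can yield an output summing to 0.4, the soft-regression model has to leave the output o unnormalized, which is the design choice the quoted passage defends.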
“…Recurrent Neural Networks (RNNs) have been widely used to address sequential prediction problems in the literature, such as speech recognition [9], human action/activity recognition [34], [41], scene labeling [39], [45], image captioning [19], and object segmentation [31]. RNN and its variants LSTM [34], GRNN [4], etc.…”
Section: Related Work (mentioning confidence: 99%)