Kwanghoon Sohn scite author profile

Context-Aware Emotion Recognition Networks

Traditional techniques for emotion recognition have focused only on facial expression analysis, which limits their ability to encode the context that comprehensively represents emotional responses. We present deep networks for context-aware emotion recognition, called CAER-Net, that exploit not only human facial expression but also context information in a joint and boosting manner. The key idea is to hide human faces in a visual scene and seek other contexts based on an attention mechanism. Our networks consist of two sub-networks: two-stream encoding networks that separately extract the features of face and context regions, and adaptive fusion networks that fuse these features adaptively. We also introduce a novel benchmark for context-aware emotion recognition, called CAER, that is more appropriate than existing benchmarks both qualitatively and quantitatively. On several benchmarks, CAER-Net demonstrates the effect of context on emotion recognition. Our dataset is available at
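The adaptive fusion idea in the abstract above can be illustrated with a minimal sketch. This is not the CAER-Net implementation: the function names, the scoring vectors `w_face` and `w_ctx`, and the choice of a softmax over two scalar scores are all assumptions made for illustration. The sketch only shows the general pattern of weighting a face stream and a context stream by attention weights before combining them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(face_feat, ctx_feat, w_face, w_ctx):
    """Weight each stream by an attention score, then concatenate.

    w_face and w_ctx are hypothetical scoring vectors standing in for the
    small learned layers a real fusion network would use.
    """
    # One scalar relevance score per stream (a dot product here).
    score_face = sum(w * x for w, x in zip(w_face, face_feat))
    score_ctx = sum(w * x for w, x in zip(w_ctx, ctx_feat))
    # Normalize the two scores so the stream weights sum to 1.
    a_face, a_ctx = softmax([score_face, score_ctx])
    # Scale each stream by its weight and concatenate the results.
    fused = [a_face * x for x in face_feat] + [a_ctx * x for x in ctx_feat]
    return fused, (a_face, a_ctx)


fused, weights = adaptive_fuse([1.0, 2.0], [0.5, 0.5], [0.3, 0.1], [0.2, 0.4])
```

In this sketch a stream with a higher score contributes more strongly to the fused vector, which is one simple way to read "fuse such features in an adaptive fashion".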

Gradient-Enhancing Conversion for Illumination-Robust Lane Detection

Yoo, Yang, Sohn (2013). IEEE Trans. Intell. Transport. Syst.

192

Visual Fatigue Prediction for Stereoscopic Image

Kim, Sohn (2011). IEEE Trans. Circuits Syst. Video Technol.

108

Deep Monocular Depth Estimation via Integration of Global and Local Predictions

Kim, Jung, Min, et al. (2018). IEEE Trans. on Image Process.

Recent work in machine learning has greatly advanced the accuracy of single-image depth estimation. However, the resulting depth images are still over-smoothed and perceptually unsatisfying. This paper casts depth prediction from a single image as a parametric learning problem. Specifically, we propose a deep variational model that effectively integrates heterogeneous predictions from two convolutional neural networks (CNNs), named the global and local networks. They have contrasting architectures and are designed to capture depth information with complementary attributes. Their intermediate outputs are then combined in the integration network based on the variational framework. By unrolling the optimization steps of Split Bregman (SB) iterations in the integration network, our model can be trained in an end-to-end manner. This enables one to simultaneously learn an efficient parameterization of the CNNs and the hyper-parameter in the variational method. Finally, we offer a new dataset of 0.22 million RGB-D images captured by Microsoft Kinect v2. Our model generates realistic and discontinuity-preserving depth predictions without involving any low-level segmentation or superpixels. Extensive experiments demonstrate the superiority of the proposed method on a range of RGB-D benchmarks covering both indoor and outdoor scenarios.
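For readers unfamiliar with Split Bregman iterations, one of their standard building blocks is a shrinkage (soft-thresholding) step. The sketch below shows only that generic operator; it is not taken from the paper, and the function name `shrink` and its `threshold` parameter are illustrative assumptions. In an unrolled network, a fixed number of such steps would be stacked as layers, with the threshold treated as a learnable quantity.

```python
def shrink(values, threshold):
    """Soft-threshold each value toward zero by `threshold`.

    Values whose magnitude is below the threshold become zero; larger
    values keep their sign and lose `threshold` in magnitude.
    """
    out = []
    for v in values:
        magnitude = max(abs(v) - threshold, 0.0)
        out.append(magnitude if v >= 0 else -magnitude)
    return out


# Hypothetical gradient-like values and threshold, for illustration only.
result = shrink([3.0, -0.5, -2.0], 1.0)
```

Small entries are suppressed while large ones survive, which is how this kind of step favors piecewise-smooth solutions with preserved discontinuities.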

PARN: Pyramidal Affine Regression Networks for Dense Semantic Correspondence

Jeon, Kim, Min, et al. (2018).

This paper presents a deep architecture for dense semantic correspondence, called pyramidal affine regression networks (PARN), that estimates locally-varying affine transformation fields across images. To deal with the intra-class appearance and shape variations that commonly exist among different instances within the same object category, we leverage a pyramidal model in which affine transformation fields are progressively estimated in a coarse-to-fine manner, so that the smoothness constraint is naturally imposed within deep networks. PARN estimates residual affine transformations at each level and composes them to estimate the final affine transformations. Furthermore, to overcome the limitations of insufficient training data for semantic correspondence, we propose a novel weakly-supervised training scheme that generates progressive supervision by leveraging correspondence consistency across image pairs. Our method is fully learnable in an end-to-end manner and does not require quantizing infinite continuous affine transformation fields. To the best of our knowledge, it is the first work that attempts to estimate dense affine transformation fields in a coarse-to-fine manner within deep networks. Experimental results demonstrate that PARN outperforms state-of-the-art methods for dense semantic correspondence on various benchmarks.
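The step of composing residual affine transformations across pyramid levels can be sketched as follows. This is a simplified illustration, not the PARN implementation: it handles a single 2x3 affine transformation rather than a dense per-pixel field, and the function names and the composition order (each finer-level residual applied after the transform accumulated so far) are assumptions.

```python
def to_homogeneous(affine):
    """Extend a 2x3 affine [A | t] to a 3x3 homogeneous matrix."""
    return [list(affine[0]), list(affine[1]), [0.0, 0.0, 1.0]]


def compose(residual, current):
    """Return the 2x3 affine that applies `current` first, then `residual`."""
    r, c = to_homogeneous(residual), to_homogeneous(current)
    product = [[sum(r[i][k] * c[k][j] for k in range(3)) for j in range(3)]
               for i in range(3)]
    return product[:2]


def compose_pyramid(residuals):
    """Fold per-level residual affines (coarse to fine) into one transform."""
    total = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # start from the identity
    for residual in residuals:
        total = compose(residual, total)
    return total


def apply_affine(affine, point):
    """Apply a 2x3 affine to an (x, y) point."""
    x, y = point
    return (affine[0][0] * x + affine[0][1] * y + affine[0][2],
            affine[1][0] * x + affine[1][1] * y + affine[1][2])


# Hypothetical residuals: a coarse-level scaling, then a fine-level shift.
coarse = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
fine = [[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]]
final = compose_pyramid([coarse, fine])
```

Because each level only has to predict a small correction to the transform accumulated so far, the coarse levels fix the large-scale alignment and the finer levels refine it.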
