Student Becoming the Master: Knowledge Amalgamation for Joint Scene Parsing, Depth Estimation, and More

Ye, Jingwen; Ji, Yixin; Wang, Xinchao; Ou, Kairi; Tao, Dapeng; Song, Mingli

doi:10.1109/cvpr.2019.00294

Cited by 69 publications

(29 citation statements)

References 27 publications


“…However, determining the focal distance solely from a single two-dimensional image is not a straight-forward operation. One potential solution is to use a neural network estimate of the depth map from a two-dimensional image [27][28][29][30]. From our framework we extract the location of the strongest edges in the frame; therefore, we can map the edge strengths to the predicted depth map to obtain the focus distance.…”

Section: Depth Of Field (mentioning)

confidence: 99%

Camera System Performance Derived from Natural Scenes

Zwanenberg, Triantaphillidou, Jenkin et al. 2020


“…Knowledge distillation methods have been widely used in many vision tasks, including object detection [30,6,13], line detection [20], semantic segmentation [62,18,34] and human pose estimation [66,40,56,58]. DOPE [58] proposes to distill the 2D and 3D poses from three independent body part expert models to the single whole-body pose detection model.…”

Section: Related Work (mentioning)

confidence: 99%

Online Knowledge Distillation for Efficient Pose Estimation

Song et al. 2021 (preprint; self-citation)


Existing state-of-the-art human pose estimation methods require heavy computational resources for accurate predictions. One promising technique to obtain an accurate yet lightweight pose estimator is knowledge distillation, which distills the pose knowledge from a powerful teacher model to a less-parameterized student model. However, existing pose distillation works rely on a heavy pre-trained estimator to perform knowledge transfer and require a complex two-stage learning procedure. In this work, we investigate a novel Online Knowledge Distillation framework, termed OKDHP, that distills Human Pose structure knowledge in a one-stage manner to guarantee distillation efficiency. Specifically, OKDHP trains a single multi-branch network and acquires the predicted heatmaps from each branch, which are then assembled by a Feature Aggregation Unit (FAU) into the target heatmaps that teach each branch in reverse. Instead of simply averaging the heatmaps, the FAU, which consists of multiple parallel transformations with different receptive fields, leverages multi-scale information and thus obtains higher-quality target heatmaps. The pixelwise Kullback-Leibler (KL) divergence is used to minimize the discrepancy between the target heatmaps and the predicted ones, which enables the student network to learn the implicit keypoint relationship. In addition, an unbalanced OKDHP scheme is introduced to customize the student networks with different compression rates. The effectiveness of our approach is demonstrated by extensive experiments on two common benchmark datasets, MPII and COCO.
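The pixelwise KL term mentioned in this abstract can be illustrated with a minimal sketch. It is not the OKDHP implementation. The function names `log_softmax` and `heatmap_kl`, the softmax normalization over pixels, and the direction KL(target || predicted) are all assumptions made for illustration.

```python
import math

def log_softmax(values):
    """Log of the softmax over a flat list of heatmap scores (assumed normalization)."""
    peak = max(values)  # subtract the maximum for numerical stability
    log_total = math.log(sum(math.exp(v - peak) for v in values))
    return [v - peak - log_total for v in values]

def heatmap_kl(target, predicted):
    """KL(target || predicted), summed over the pixels of one 2D keypoint heatmap.

    Hypothetical helper: both heatmaps are lists of rows with the same shape.
    """
    log_p = log_softmax([v for row in target for v in row])
    log_q = log_softmax([v for row in predicted for v in row])
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

# Toy 2x2 heatmaps: an aggregated target and one branch's prediction.
target_map = [[0.1, 2.0], [0.3, 0.2]]
branch_map = [[0.5, 1.0], [0.4, 0.1]]
loss = heatmap_kl(target_map, branch_map)
```

In a training loop, one such term would be computed per keypoint and per branch and then added to the usual supervised heatmap loss. How the terms are weighted is not stated in the abstract.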


“…Recently, image segmentation has been used in a variety of computer vision tasks, such as depth prediction [37], virtual try-on [20], scene understanding [38], [39] and image generation [40]. The use of SLIC can obtain color parsing to extend the line hint to the local area, thereby providing more color information to the interactive colorization.…”

Section: Global Hint (mentioning)

confidence: 99%

Two-Stage Sketch Colorization With Color Parsing

Hui

Gao

2020

IEEE Access


We implement high-quality sketch colorization using two-stage conditional generative adversarial network (GAN) training based on different intermediate features. The intermediate features used in autonomous colorization are the grayscale parsing and interval pixel-level color parsing. The autonomous colorization based on the grayscale parsing feature can learn the spatial topology of pixels in the first stage to guide the colorization in the second stage. The autonomous colorization based on the pixel-level color parsing feature can learn the color information of a few feature points in the first stage to guide the colorization of all pixels in the second stage. Additionally, we use the intermediate feature of sampling points as a constraint and achieve the color reconstruction using Laplacian mesh editing as a special second stage. Furthermore, the interactive colorization uses the superpixel color parsing as the intermediate feature. Specifically, we use simple linear iterative clustering (SLIC) to obtain a palette that maintains the edges in the first stage to guide the colorization in the second stage. As for evaluation metrics, we propose a color-coded local binary pattern (CCLBP) score based on color distances from the first-order 8 pixels to the central pixel, to measure the degrees of color blurring and mess. We also propose a light-sensitivity (LS) score based on the reversed grayscale map, to measure the degrees of auto painting and overfitting of the color hint. According to the L1 distances between the original and generated color images based on these scores, compared with state-of-the-art methods including one-stage approaches such as pix2pix and PaintsChainer and two-stage approaches such as Style2Paints and DeepColor, our model can achieve the highest-quality autonomous colorization. Moreover, compared with pix2pix, PaintsChainer and Style2Paints with color hints, according to the proposed objective evaluation as well as the user visual study, our model can achieve the highest-quality interactive colorization as well.
