Learning Image and Video Compression Through Spatial-Temporal Energy Compaction

Cheng, Zhengxue; Sun, Heming; Takeuchi, Masaru; Katto, Jiro

doi:10.1109/cvpr.2019.01031

Cited by 99 publications

(78 citation statements)

References 26 publications

(74 reference statements)

“…Most recently, several end-to-end deep video compression methods have been proposed [8,7,38,9,22,13]. Specifically, Wu et al [38] proposed predicting frames by interpolation from reference frames, and the image compression network of [31] is applied to compress the residual.…”

Section: Deep Video Compression (mentioning)

confidence: 99%

“…In 2019, Lu et al [22] proposed the Deep Video Compression (DVC) method, in which optical flow is used to estimate the temporal motion, and two auto-encoders are employed to compress the motion and residual, respectively. Meanwhile, in [9], spatial-temporal energy compaction is added into the loss function to improve the performance of video compression. Later, Habibian et al [13] proposed the rate-distortion auto-encoder, which uses an autoregressive prior for video entropy coding.…”

Section: Deep Video Compression (mentioning)

confidence: 99%

“…We compare HLVC with the latest learned video compression methods. Among them, Habibian et al [13] and Cheng et al [9] are optimized for MS-SSIM. DVC [22] and Wu et al [38] are optimized for PSNR.…”

Section: Settings (mentioning)

confidence: 99%

“…Recent studies in learned image compression, e.g., [2,3], show the great potential of deep learning for improving the rate-distortion performance. It is therefore not surprising to see increasing interest in compressing video with Deep Neural Networks (DNNs) [8,38,9,22,13]. For example, Lu et al [22] proposed using optical flow for motion compensation and applying auto-encoders to compress the flow and residual.…”

Section: Introduction (mentioning)

confidence: 99%

Learning for Video Compression With Hierarchical Quality and Recurrent Enhancement

Yang

Mentzer

Gool

et al. 2020

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

164

182

In this paper, we propose a Hierarchical Learned Video Compression (HLVC) method with three hierarchical quality layers and a recurrent enhancement network. The frames in the first layer are compressed by an image compression method with the highest quality. Using these frames as references, we propose the Bi-Directional Deep Compression (BDDC) network to compress the second layer with relatively high quality. Then, the third layer frames are compressed with the lowest quality, by the proposed Single Motion Deep Compression (SMDC) network, which adopts a single motion map to estimate the motions of multiple frames, thus saving bits for motion information. In our deep decoder, we develop the Weighted Recurrent Quality Enhancement (WRQE) network, which takes both compressed frames and the bit stream as inputs. In the recurrent cell of WRQE, the memory and update signal are weighted by quality features to reasonably leverage multi-frame information for enhancement. In our HLVC approach, the hierarchical quality benefits the coding efficiency, since the high quality information facilitates the compression and enhancement of low quality frames at encoder and decoder sides, respectively. Finally, the experiments validate that our HLVC approach advances the state-of-the-art of deep video compression methods, and outperforms the "Low-Delay P (LDP) very fast" mode of x265 in terms of both PSNR and MS-SSIM. The project page is at https://github.com/RenYang-home/HLVC.

“…Such proactive, predictive perceptual capabilities enable us to take desired actions and avoid dangerous situations. To make computing machines achieve a similar level to the human perception, motion understanding and representation have been studied in many computer vision tasks such as optical flow [1]- [3], object tracking [4]- [6], action recognition [7], future frame prediction [8], video interpolation [9], and video compression [10]. However, most conventional techniques depend on temporal information from multiple consecutive frames to estimate motions.…”

Section: Introduction (mentioning)

confidence: 99%

Instance-Level Future Motion Estimation in a Single Image Based on Ordinal Regression and Semi-Supervised Domain Adaptation

Kim

Koh

2020

IEEE Access

A novel algorithm to estimate instance-level future motion (FM) in a single image is proposed in this paper. First, the FM of an instance is defined with its direction, speed, and action classes. Then, a deep neural network, called FM-Net, is developed to determine the FM of the instance. More specifically, the multi-context pooling layer is proposed to exploit both object and global context features, and the cyclic ordinal regression scheme is developed using binary classifiers for effective FM classification. Also, the proposed FM-Net is trained in a semi-supervised domain adaptation setting to obtain reliable FM estimation results, even when a source domain in the training process and a target domain in the inference process are different. Extensive experimental results demonstrate that the proposed algorithm provides remarkable performance and thus can be used effectively for computer vision applications, including single object tracking, multiple object tracking, and crowd analysis. Furthermore, the FM dataset, collected from diverse sources and annotated manually, is released as a benchmark for single-image FM estimation.

Content Adaptive and Error Propagation Aware Deep Video Compression

Cai

Zhang

et al. 2020

Lecture Notes in Computer Science
