Learning-Based Scalable Image Compression With Latent-Feature Reuse and Prediction

Mei, Yixin; Li, Li; Li, Zhu; Li, Fan

doi:10.1109/tmm.2021.3114548

“…Inspired by traditional scalable video coding frameworks, scalable learned compression schemes [114][115][116] have been proposed, generating varying quality levels based on the layers of received bitstreams. Jia et al [114] introduced a scalable autoencoder (SAE) image compression network to mitigate the necessity of training multiple models for different bitrate points.…”

Section: Variable Rate Model (mentioning)

confidence: 99%

“…The SAE-based deep image codec comprises hierarchical coding layers as the base and the enhancement layers. Mei et al [115] proposed a quality and spatial scalable image compression (QSSIC) model in a multi-layer structure, where each layer generates one bitstream corresponding to a specified resolution and image fidelity. This scalability is achieved by exploring the potential of feature-domain representation prediction and reuse.…”

Section: Variable Rate Model (mentioning)

confidence: 99%

Unveiling the Future of Human and Machine Coding: A Survey of End-to-End Learned Image Compression

Huang, Wu

2024

Entropy


End-to-end learned image compression codecs have notably emerged in recent years. These codecs have demonstrated superiority over conventional methods, showcasing remarkable flexibility and adaptability across diverse data domains while supporting new distortion losses. Despite challenges such as computational complexity, learned image compression methods inherently align with learning-based data processing and analytic pipelines due to their well-suited internal representations. The concept of Video Coding for Machines has garnered significant attention from both academic researchers and industry practitioners. This concept reflects the growing need to integrate data compression with computer vision applications. In light of these developments, we present a comprehensive survey and review of lossy image compression methods. Additionally, we provide a concise overview of two prominent international standards, MPEG Video Coding for Machines and JPEG AI. These standards are designed to bridge the gap between data compression and computer vision, catering to practical industry use cases.


“…Inspired by traditional scalable video coding frameworks, scalable learned compression schemes [114][115][116] have been proposed, generating varying quality levels based on the layers of received bitstreams. Jia et al [114] introduce a scalable autoencoder (SAE) image compression network to mitigate the necessity for training multiple models for different bitrate points.…”

Section: Variable Rate Model (mentioning)

confidence: 99%

“…The SAE-based deep image codec comprises hierarchical coding layers as the base and the enhancement layers. Mei et al [115] propose a quality and spatial scalable image compression (QSSIC) model in a multi-layer structure, where each layer generates one bitstream corresponding to a specified resolution and image fidelity. This scalability is achieved by exploring the potential of feature-domain representation prediction and reuse.…”

Section: Variable Rate Model (mentioning)

confidence: 99%

Unveiling the Future of Human and Machine Coding: A Survey of End-to-End Learned Image Compression

Huang, Wu

2024

Preprint



“…in Ref. 18 also reused the latent features observed in different layers to realize image compression and reconstruction. In this model, reconstructed images with scalable quality are acquired with scalable bitstreams.…”

Section: Related Work (mentioning)

confidence: 99%

Deep feature extraction and compression for human–machine vision

Huang, An, Yang, et al.

2023

J. Electron. Imag.


Latent representation features in deep learning (DL) exhibit excellent potential for visual data applications. For example, in traffic monitoring and video surveillance, the features simultaneously perform image analysis for machine vision and image reconstruction for human viewing. However, the existing deep features that appeal to machine and human receivers are always combinations of separated pieces and specific features. Due to these features being extracted from different branches in collaboration frameworks, the inherent relations between machine and human vision are insufficiently explored. Therefore, to obtain one set of representative and generic features, we propose a dynamic groupwise splitting network based on image content to explore and extract generic features for the two different receivers. First, we analyze the characteristics of the latent features and adopt intermediate features as the base features. Then, a feature classification and transformation mechanism based on image content is proposed to enhance the base features for further image reconstruction and analysis. Consequently, an end-to-end model with multimodel cascading and multistage training realizes both machine and human vision tasks. Extensive experiments show that our human-machine vision collaboration framework has high practical value and performance.



Cited by 13 publications

References 26 publications

