2021 IEEE International Conference on Multimedia and Expo (ICME) 2021
DOI: 10.1109/icme51207.2021.9428224

Learned Image Coding for Machines: A Content-Adaptive Approach

Abstract: Today, according to the Cisco Annual Internet Report (2018-2023), the fastest-growing category of Internet traffic is machine-to-machine communication. In particular, machine-to-machine communication of images and videos represents a new challenge and opens up new perspectives in the context of data compression. One possible solution approach consists of adapting current human-targeted image and video coding standards to the use case of machine consumption. Another approach consists of …

Cited by 35 publications (20 citation statements)
References: 12 publications
“…2(b), which combines compression and machine vision analysis network structure and devises joint optimization strategies. Some methods [25]- [30], which are based on existing learned image compression frameworks, obtain the reconstructed image more appropriate for analysis through joint learning. However, in most cases, the quality of the image suffers.…”
Section: B Feature Coding (mentioning)
confidence: 99%
“…As new technologies for video applications (e.g., virtual reality, augmented reality, and point clouds) revolutionize the video coding industry, the heterogeneity and complexity of the captured data are presenting increasing challenges in the efficient compression of these data. Based on a review of the methods applied to date for video RC in the ML and DL domains, this paper argues that future ML and DL techniques can help to achieve smarter video coding [142]- [144].…”
Section: Future Work (mentioning)
confidence: 99%
“…These terms are defined in the same way as in [5], except for L_task which is replaced by a proxy loss L_proxy in our setup. Due to the correlations in the intermediate level features of different vision tasks [9], we can use an intermediate feature distortion metric as a proxy for L_task, thus making the codec task-agnostic. Additionally, using a feature-based loss as such enables the training of the model with cropped images which is much more efficient.…”
Section: Baseline Image Codec Model (mentioning)
confidence: 99%
“…Additionally, using a feature-based loss as such enables the training of the model with cropped images which is much more efficient. Similar to [9,18], we define L_proxy as follows:…”
Section: Baseline Image Codec Model (mentioning)
confidence: 99%