2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP)
DOI: 10.1109/mmsp55362.2022.9949999

Video Coding for Machines: Large-Scale Evaluation of Deep Neural Networks Robustness to Compression Artifacts for Semantic Segmentation

Abstract: In the Video Coding for Machines (VCM) context, where visual content is compressed before being transmitted to a vision task algorithm, an appropriate trade-off between the compression level and the vision task performance must be chosen. In this paper, the robustness of a Deep Neural Network (DNN) based semantic segmentation algorithm to compression artifacts is evaluated across a total of 1486 different coding configurations. Results indicate the importance of using an appropriate image resolution to overcome the block-…

Cited by 3 publications (2 citation statements)
References 24 publications (37 reference statements)
“…We refer to a lossy compression scheme as a scheme involving the following steps: image downsampling, encoding, decoding, and upsampling back to the original image resolution. As it has been shown that image downsampling is crucial for obtaining an optimal trade-off between rate and DNN performance [14], [18], we include it in the lossy compression scheme through bicubic interpolation under 4 different downsampling factors. The JPEG, JM, x265, and VVenC lossy compression algorithms are applied, each with 11 different quantization levels.…”
Section: A. Built Dataset
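The four-step scheme described in this quotation (downsample → encode → decode → upsample) can be sketched as follows. This is a minimal illustration of my own: block-averaging and uniform quantization stand in for the bicubic interpolation and the actual JPEG/JM/x265/VVenC codecs, which require external encoder binaries or libraries.

```python
import numpy as np

def lossy_scheme(img: np.ndarray, factor: int = 2, step: float = 16.0) -> np.ndarray:
    """Downsample -> encode -> decode -> upsample, mirroring the four steps
    of the lossy compression scheme quoted above (a sketch only)."""
    h, w = img.shape
    # Downsample by block-averaging (stand-in for bicubic interpolation).
    small = img[:h - h % factor, :w - w % factor].astype(float).reshape(
        h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    # Uniform quantization with 'step' stands in for a real codec's
    # quantization level (11 levels per codec in the cited work).
    decoded = np.round(small / step) * step
    # Upsample back to the (factor-aligned) original resolution.
    return np.kron(decoded, np.ones((factor, factor)))
```

Sweeping `factor` and `step` reproduces, in miniature, the rate/performance grid the cited work explores with real codecs.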
“…We denote the model trained with pristine images I and GT labels as S_0. Once trained, pseudo-GT predictions P can be obtained by inputting the #D validation images I to the DNN model S_0. As shown in the literature, a DNN trained on losslessly compressed images I, such as the original Cityscapes dataset, generalizes poorly to compressed images Î, since the DNN encounters artifacts that were not present at training time [7], [17], [18]. To mitigate this bias, progressive training [18] is employed to obtain segmentation models S_Ci that are resilient to artifacts generated by coding configuration C_i, i ∈ {1, 2, …”
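The progressive-training procedure this quotation refers to can be sketched as a loop over coding configurations. Everything here is an illustrative assumption: `fine_tune` is a hypothetical training step, and chaining each model S_Ci from the previously adapted model (rather than restarting from S_0) is my reading of "progressive", not the paper's confirmed procedure.

```python
def progressive_training(model, pristine_data, coded_datasets, fine_tune):
    """Sketch of progressive training: train a baseline S_0 on pristine
    data, then adapt it to each coding configuration C_i in turn.
    `fine_tune(model, data)` is a hypothetical training step; chaining
    from the previous model is an assumption."""
    s0 = fine_tune(model, pristine_data)   # baseline model S_0
    models, prev = {}, s0
    for ci, data in coded_datasets.items():
        prev = fine_tune(prev, data)       # model S_Ci, adapted to C_i artifacts
        models[ci] = prev
    return s0, models
```

With a toy `fine_tune` that merely records the datasets seen, the model for the last configuration ends up exposed to the pristine data plus every earlier configuration in sequence, which is the intended cumulative-robustness effect.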