2021 IEEE International Conference on Image Processing (ICIP) 2021
DOI: 10.1109/icip42928.2021.9506150

Human-Machine Collaborative Video Coding Through Cuboidal Partitioning

Abstract: Video coding algorithms encode and decode an entire video frame, while feature coding techniques preserve and communicate only the most critical information needed for a given application. This is because video coding targets human perception, while feature coding targets machine vision tasks. Recently, attempts have been made to bridge the gap between these two domains. In this work, we propose a video coding framework by leveraging the commonality that exists between human vision and machine vision app…

Cited by 8 publications (3 citation statements)
References 19 publications
“…Ahmmed et al [28] proposed a cuboids-based human-machine collaborative video coding method. The method generates a cuboid map by extracting cuboidal features from a video frame and takes the mean pixel intensity value of each cuboid area.…”
Section: ) Image Compression Researches For Machine Vision or Human-m... (mentioning)
confidence: 99%
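
The statement above describes two steps: estimating rectangular regions (cuboids) over a frame, then representing each region by its mean pixel intensity. A minimal Python sketch of that idea follows. It is not the cited authors' implementation: the greedy split criterion (largest reduction in sum of squared errors), the exhaustive search over split positions, and the function names `best_split`, `partition`, and `mean_map` are all assumptions made for illustration.

```python
import numpy as np


def sse(block):
    """Sum of squared deviations of a block from its own mean."""
    return float(((block - block.mean()) ** 2).sum())


def best_split(frame, rect):
    """Best axis-aligned split of rect = (r0, r1, c0, c1).

    Returns (gain, rect_a, rect_b), where gain is the reduction in SSE,
    or None if the rectangle is a single pixel and cannot be split.
    """
    r0, r1, c0, c1 = rect
    base = sse(frame[r0:r1, c0:c1])
    best = None
    for r in range(r0 + 1, r1):  # horizontal cuts
        cost = sse(frame[r0:r, c0:c1]) + sse(frame[r:r1, c0:c1])
        if best is None or base - cost > best[0]:
            best = (base - cost, (r0, r, c0, c1), (r, r1, c0, c1))
    for c in range(c0 + 1, c1):  # vertical cuts
        cost = sse(frame[r0:r1, c0:c]) + sse(frame[r0:r1, c:c1])
        if best is None or base - cost > best[0]:
            best = (base - cost, (r0, r1, c0, c), (r0, r1, c, c1))
    return best


def partition(frame, n_cuboids):
    """Greedily split the frame until it is covered by n_cuboids rectangles."""
    frame = np.asarray(frame, dtype=float)
    rects = [(0, frame.shape[0], 0, frame.shape[1])]
    while len(rects) < n_cuboids:
        options = [(best_split(frame, rect), i) for i, rect in enumerate(rects)]
        options = [(s, i) for s, i in options if s is not None]
        if not options:  # only single-pixel rectangles remain
            break
        split, idx = max(options, key=lambda o: o[0][0])
        rects[idx:idx + 1] = [split[1], split[2]]
    return rects


def mean_map(frame, rects):
    """Fill each rectangle with the mean intensity of the pixels it covers."""
    frame = np.asarray(frame, dtype=float)
    out = np.empty_like(frame)
    for r0, r1, c0, c1 in rects:
        out[r0:r1, c0:c1] = frame[r0:r1, c0:c1].mean()
    return out
```

Calling `mean_map(frame, partition(frame, n))` on a grayscale frame yields a piecewise-constant approximation with `n` rectangles. The exhaustive search is slow on full-resolution frames and is kept here only for readability.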
“…Recently, attempts are being made to bridge the gap between these two domains. In work [69], authors propose a video coding framework by leveraging on to the commonality that exists between human vision and computer vision applications using cuboids. This is because cuboids, estimated rectangular regions over a video frame, are computationally efficient, has a compact representation and object centric.…”
Section: Video Coding Scheme For Specific Computer Vision Tasks (mentioning)
confidence: 99%
“…additional information in the enhancement layer, these methods also support high-quality input reconstruction for human vision. Meanwhile, [14], [15] tried to recover the input image directly from features, without an additional bitstream: [14] from intermediate-layer activations of YOLOv2 [16], and [15] from cuboidal features targeted at YOLOv2. If X denotes the input image, Y the latent space features, X̂ the reconstructed image, and T the machine task output, then a typical machine vision pipeline can be described by a Markov chain X → Y → X̂ → T.…”
Section: Introduction (mentioning)
confidence: 99%
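
The Markov chain in the last statement can be made concrete with two standard identities. The block below is a sketch rather than a result quoted from the citing paper. It writes $\hat{X}$ for the reconstructed image to keep it distinct from the input $X$, and it assumes each stage depends only on the output of the stage before it, which is what the chain notation asserts. The first line is the resulting factorization of the joint distribution; the second is the data processing inequality applied along the chain, where $I(\cdot\,;\cdot)$ denotes mutual information.

```latex
\begin{align}
p(x, y, \hat{x}, t) &= p(x)\, p(y \mid x)\, p(\hat{x} \mid y)\, p(t \mid \hat{x}) \\
I(X; T) &\le I(X; \hat{X}) \le I(X; Y)
\end{align}
```

In words: the task output cannot carry more information about the input than the reconstruction does, and the reconstruction cannot carry more than the features it was recovered from.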