Debargha Mukherjee scite author profile

This paper addresses the problem of encoder optimization in a macroblock-based multi-mode video compression system. An efficient solution is proposed in which, for a given image region, the optimum combination of macroblock modes and the associated mode parameters are jointly selected so as to minimize the overall distortion for a given bit-rate budget. Conditions for optimizing the encoder operation are derived within a rate-constrained product code framework using a Lagrangian formulation. The instantaneous rate of the encoder is controlled by a single Lagrange multiplier that makes the method amenable to mobile wireless networks with time-varying capacity. When rate and distortion dependencies are introduced between adjacent blocks, as is the case when the motion vectors are differentially encoded and/or overlapped block motion compensation is employed, the ensuing encoder complexity is surmounted using dynamic programming. Due to the generic nature of the algorithm, it can be successfully applied to the problem of encoder control in numerous video coding standards, including H.261, MPEG-1, and MPEG-2. Moreover, the strategy is especially relevant for very low bit rate coding over wireless communication channels where the low dimensionality of the images associated with these bit rates makes real-time implementation very feasible. Accordingly, in this paper the method is successfully applied to the emerging H.263 video coding standard with excellent results at rates as low as 8.0 Kbits per second. Direct comparisons with the H.263 test model, TMN5, demonstrate that gains in PSNR are achievable over a wide range of rates.
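
The Lagrangian mode decision described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the table layout (`distortion[i][m]`, `rate[i][m]`), the hypothetical `pair_rate[p][m]` dependency term, and the simplification of the dependency to a one-dimensional chain of blocks are all assumptions made for illustration. Each block picks the mode minimizing D + λR; when the rate of a block depends on the previous block's mode, the joint minimum is found with dynamic programming.

```python
from typing import List, Sequence


def select_modes(distortion: Sequence[Sequence[float]],
                 rate: Sequence[Sequence[float]],
                 lam: float) -> List[int]:
    """Independent blocks: per block, pick the mode minimizing D + lam * R."""
    return [
        min(range(len(d_i)), key=lambda m: d_i[m] + lam * r_i[m])
        for d_i, r_i in zip(distortion, rate)
    ]


def select_modes_dp(distortion: Sequence[Sequence[float]],
                    rate: Sequence[Sequence[float]],
                    pair_rate: Sequence[Sequence[float]],
                    lam: float) -> List[int]:
    """Chain-dependent blocks (assumed model): pair_rate[p][m] is the extra
    rate of mode m when the previous block used mode p. Viterbi-style DP."""
    n_modes = len(distortion[0])
    # Best Lagrangian cost of any mode path ending in mode m at block 0.
    cost = [distortion[0][m] + lam * rate[0][m] for m in range(n_modes)]
    back = []  # back[i - 1][m] = best predecessor mode for mode m at block i
    for i in range(1, len(distortion)):
        new_cost, ptr = [], []
        for m in range(n_modes):
            p = min(range(n_modes),
                    key=lambda q: cost[q] + lam * pair_rate[q][m])
            new_cost.append(cost[p] + lam * pair_rate[p][m]
                            + distortion[i][m] + lam * rate[i][m])
            ptr.append(p)
        cost = new_cost
        back.append(ptr)
    # Trace the cheapest path backwards from the last block.
    m = min(range(n_modes), key=lambda q: cost[q])
    path = [m]
    for ptr in reversed(back):
        m = ptr[m]
        path.append(m)
    path.reverse()
    return path
```

Raising `lam` shifts every decision toward cheaper modes, which is how a single multiplier can steer the instantaneous rate in this kind of scheme.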

The latest open-source video codec VP9 - An overview and preliminary results

Mukherjee

Bankoski

Grange

et al. 2013

190

Learning-Based, Automatic 2D-to-3D Image and Video Conversion

Konrad

Wang

Ishwar

et al. 2013

IEEE Trans. on Image Process.

Despite significant growth in the last few years, the availability of 3D content is still dwarfed by that of its 2D counterpart. To close this gap, many 2D-to-3D image and video conversion methods have been proposed. Methods involving human operators have been most successful but also time-consuming and costly. Automatic methods, which typically make use of a deterministic 3D scene model, have not yet achieved the same level of quality, for they rely on assumptions that are often violated in practice. In this paper, we propose a new class of methods that are based on the radically different approach of learning the 2D-to-3D conversion from examples. We develop two types of methods. The first is based on learning a point mapping from local image/video attributes, such as color, spatial position, and, in the case of video, motion at each pixel, to scene-depth at that pixel using a regression type idea. The second method is based on globally estimating the entire depth map of a query image directly from a repository of 3D images (image+depth pairs or stereopairs) using a nearest-neighbor regression type idea. We demonstrate both the efficacy and the computational efficiency of our methods on numerous 2D images and discuss their drawbacks and benefits. Although far from perfect, our results demonstrate that repositories of 3D content can be used for effective 2D-to-3D image conversion. An extension to video is immediate by enforcing temporal continuity of computed depth maps.
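
The second, global method in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: the function name `knn_depth`, the use of plain feature vectors with squared Euclidean distance, and the per-pixel median used to fuse the retrieved depth maps are all hypothetical choices. The sketch retrieves the k repository images most similar to the query and combines their depth maps into one estimate.

```python
import statistics
from typing import List, Sequence, Tuple

DepthMap = List[List[float]]


def knn_depth(query_feat: Sequence[float],
              repo: Sequence[Tuple[Sequence[float], DepthMap]],
              k: int) -> DepthMap:
    """Estimate a depth map from the k nearest repository entries.

    repo holds (feature_vector, depth_map) pairs; all depth maps are
    assumed to share the same dimensions.
    """
    # Rank repository entries by squared Euclidean distance to the query.
    ranked = sorted(
        repo,
        key=lambda item: sum((a - b) ** 2
                             for a, b in zip(query_feat, item[0])),
    )
    nearest = [depth for _, depth in ranked[:k]]
    rows, cols = len(nearest[0]), len(nearest[0][0])
    # Fuse the retrieved depth maps with a per-pixel median (assumed rule).
    return [
        [statistics.median(d[r][c] for d in nearest) for c in range(cols)]
        for r in range(rows)
    ]
```

A median is used here because it tolerates a few badly matched neighbors better than a mean would; any other robust fusion rule could be substituted.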

Optimal adaptation decision-taking for terminal and network quality-of-service

Mukherjee

Delfosse

Kim

et al. 2005

IEEE Trans. Multimedia

In order to cater to the diversity of terminals and networks, efficient and flexible adaptation of multimedia content in the delivery path to end consumers is required. To this end, it is necessary to associate the content with metadata that provides the relationship between feasible adaptation choices and various media characteristics obtained as a function of these choices. Furthermore, adaptation is driven by specification of terminal, network, user preference, or rights-based constraints on media characteristics that are to be satisfied by the adaptation process. Using the metadata and the constraint specification, an adaptation engine can take an appropriate decision for adaptation, efficiently and flexibly. MPEG-21 Part 7, entitled Digital Item Adaptation, standardizes, among other things, the metadata and constraint specifications that act as interfaces to the decision-taking component of an adaptation engine. This paper presents the concepts behind these tools in the standard, shows universal methods based on pattern search to process the information in the tools to make decisions, and presents some adaptation use cases where these tools can be used.

Video Super-Resolution Using Codebooks Derived From Key-Frames

Hung

Queiroz

Brandi

et al. 2012

IEEE Trans. Circuits Syst. Video Technol.

A Technical Overview of AV1

Han

Mukherjee

et al. 2021

Proc. IEEE

A Technical Overview of VP9—The Latest Open-Source Video Codec

Mukherjee

Han

Bankoski

et al. 2015

SMPTE Mot. Imag. J

Super-resolution of video using key frames and motion estimation

Brandi

Queiroz

Mukherjee

2008
