In a recent work, the authors proposed a novel paradigm for interactive video streaming and coined the term JPEG2000-Based Scalable Interactive Video (JSIV) for it. In this work, we investigate JSIV when motion compensation is employed to improve prediction, something that was intentionally left out in our earlier treatment. JSIV relies on three concepts: storing the video sequence as independent JPEG2000 frames to provide quality and spatial resolution scalability, prediction and conditional replenishment of code-blocks to exploit inter-frame redundancy, and loosely coupled server and client policies in which a server optimally selects the number of quality layers for each code-block transmitted and a client makes the most of the received (distorted) frames. In JSIV, the server transmission problem is optimally solved using Lagrangian-style rate-distortion optimization. The flexibility of JSIV enables us to employ a wide variety of frame prediction arrangements, including hierarchical B-frames. JSIV provides considerably better interactivity compared with existing schemes and can adapt immediately to interactive changes in client interests, such as forward or backward playback and zooming into individual frames. Experimental results show that JSIV's performance is inferior to that of SVC in conventional streaming applications while JSIV performs better in interactive browsing applications.
A video stored as a sequence of JPEG2000 images can provide the scalability, flexibility, and accessibility that are lacking in current predictive motion-compensated video coding standards; however, streaming this sequence would consume considerably more bandwidth. This paper presents a new method for optimized streaming of a JPEG2000 video that relies on motion compensation and server-optimized conditional replenishment to reduce temporal redundancy, in collaboration with an intelligent client policy for reconstructing the available content. In particular, we propose transmission of motion vectors and an optimized number of layers, possibly zero, for each code-block of the JPEG2000 representation of each new frame. We also propose the use of a sliding window to optimize a group of frames such that code-blocks of these frames have more than one chance of being enhanced if that is beneficial to subsequent frames. Rate-distortion optimization in the Lagrangian sense is employed to achieve the lowest possible MSE. It is expected that mobile clients, with their limited processing power, would benefit from this work in real-time and interactive applications, such as teleconferencing and surveillance. This paper introduces the concept, formulates optimization criteria, and compares the performance with alternative strategies.
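The per-code-block layer selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `select_layers` and `allocate`, the bisection search on the Lagrange multiplier, and the rate and distortion tables are all assumptions made for illustration. The sketch also treats code-blocks as independent, whereas the paper's formulation accounts for how a code-block's quality affects the frames predicted from it.

```python
from typing import List, Sequence, Tuple

# One code-block: (rates, dists), both indexed by number of layers q.
# rates[q] = cumulative bytes needed to send q layers (rates[0] == 0).
# dists[q] = distortion (e.g. squared error) after receiving q layers;
#            dists[0] is the distortion of using the prediction alone.
Block = Tuple[Sequence[float], Sequence[float]]


def select_layers(rates: Sequence[float], dists: Sequence[float], lam: float) -> int:
    """Return the layer count q that minimizes D[q] + lam * R[q]."""
    costs = [d + lam * r for r, d in zip(rates, dists)]
    return min(range(len(costs)), key=costs.__getitem__)


def allocate(blocks: List[Block], budget: float,
             lam_hi: float = 1e6, iters: int = 60) -> List[int]:
    """Bisect on lam to find the smallest multiplier whose total rate fits.

    Assumes lam_hi is large enough that every block picks zero layers,
    so the starting point is always within the budget.
    """
    lam_lo = 0.0
    best = [select_layers(r, d, lam_hi) for r, d in blocks]
    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        choice = [select_layers(r, d, lam) for r, d in blocks]
        total = sum(r[q] for (r, _), q in zip(blocks, choice))
        if total <= budget:
            best, lam_hi = choice, lam   # feasible: try a smaller multiplier
        else:
            lam_lo = lam                 # over budget: penalize rate more
    return best


# Hypothetical data: the first block is poorly predicted, the second is well predicted.
blocks = [
    ([0, 100, 200], [50.0, 10.0, 5.0]),
    ([0, 100, 200], [4.0, 3.0, 2.5]),
]
layers = allocate(blocks, budget=200)
```

With these made-up numbers, the whole budget goes to the poorly predicted block and the well-predicted block receives zero layers, which is the "possibly zero" case mentioned in the abstract.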
We propose a novel paradigm for interactive video streaming and coin the term JPEG2000-based scalable interactive video (JSIV) for it. JSIV utilizes JPEG2000 to independently compress the original video sequence frames and to provide quality and spatial resolution scalability. To exploit inter-frame redundancy, JSIV utilizes prediction and conditional replenishment of code-blocks, aided by a server policy that optimally selects the number of quality layers for each code-block transmitted and a client policy that makes the most of the received (distorted) frames. It is also possible for JSIV to employ motion compensation; however, we leave this topic to future work. To optimally solve the server transmission problem, a Lagrangian-style rate-distortion optimization procedure is employed. In JSIV, a wide variety of frame prediction arrangements can be employed, including the hierarchical B-frames of the scalable video coding (SVC) extension of the H.264/AVC standard. JSIV provides considerably better interactivity compared to existing schemes and can adapt immediately to interactive changes in client interests, such as forward or backward playback and zooming into individual frames. Experimental results for surveillance footage, which does not suffer from the absence of motion compensation, show that JSIV's performance is comparable to that of SVC in some usage scenarios while JSIV performs better in others.
Streaming video as a sequence of JPEG2000 images provides scalability, flexibility, and accessibility at a wide range of bit-rates, which are lacking in current motion-compensated predictive video coding standards; however, streaming this sequence requires considerably more bandwidth. The authors have recently proposed a novel approach that reduces the required bandwidth; this approach uses motion compensation and conditional replenishment of the JPEG2000 code-blocks, aided by server-optimized selection of these code-blocks. This work extends the previous work to the case of a hierarchical arrangement of frames, similar to the hierarchical B-frames of the scalable video coding (SVC) extension of the H.264/AVC standard. We apply a Lagrangian-style rate-distortion optimization procedure to the server transmission problem and compare the performance to that of streaming individual frames and also to that of predictive video coding. The proposed approach can serve a diverse range of client requirements and can adapt immediately to interactive changes in client interests, such as forward or backward playback and zooming into individual frames. This paper introduces the concepts, formulates the optimization problem, proposes a solution, and compares the performance to alternative strategies.
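As a rough illustration of a hierarchical frame arrangement, the sketch below builds a dyadic prediction structure in which the two end frames of a group are coded independently and each intermediate frame is predicted from its two nearest frames at a coarser temporal level. The function name `hierarchical_order`, the tuple layout, and the group length of 8 are assumptions for illustration; the abstract does not specify the exact arrangement the authors used.

```python
from typing import List, Tuple

# (frame index, left reference, right reference, temporal level)
Entry = Tuple[int, int, int, int]


def hierarchical_order(start: int, end: int, level: int = 1) -> List[Entry]:
    """List the frames strictly between start and end in a dyadic hierarchy.

    Frames `start` and `end` are assumed to be available already (for
    example, as independently coded frames). Each returned frame appears
    after both of its references, so the list is a valid processing order.
    """
    if end - start < 2:
        return []  # no frame lies strictly between the two references
    mid = (start + end) // 2
    return ([(mid, start, end, level)]
            + hierarchical_order(start, mid, level + 1)
            + hierarchical_order(mid, end, level + 1))


# Hypothetical group of 8 frames bounded by frames 0 and 8.
order = hierarchical_order(0, 8)
for frame, left, right, lvl in order:
    print(f"frame {frame}: predicted from {left} and {right} (level {lvl})")
```

In such a structure, a client that only needs a reduced frame rate can stop at a coarser temporal level, since frames at finer levels are never used as references by coarser ones.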