Abstract-This paper builds on an interactive streaming architecture that supports both user feedback interpretation, and temporal juxtaposition of multiple video bitstreams in a single streaming session. As an original contribution, these functionalities can be exploited to offer improved viewing experience, when accessing football content through individual and potentially bandwidth constrained connections. Starting from a conventional broadcasted content, our system automatically splits the native content into non-overlapping and semantically consistent segments. Each segment is then divided into shots, based on conventional view boundary detection. Shots are finally splitted in small clips. These clips support our browsing capabilities during the whole playback in a temporally consistent way. Multiple versions are automatically created to render each clip. Versioning depends on the view type of the initial shot, and typically corresponds to the generation of zoomed in and spatially or temporally subsampled video streams. Clips are encoded independently so that the server can decide on the fly the version to send as a function of the semantic relevance of the segments (in a user-transparent basis, as inferred from video analysis or metadata) and the interactive user requests. Replaying certain game actions is also offered upon request. The streaming is automatically switched to the requested event. Later, the playback is resumed without any offset. The capabilities of our system rely on the H.264/AVC standard. We use soccer videos to validate our framework in subjective experiments showing the feasibility and relevance of our system.
This paper describes an interactive and adaptive streaming architecture that exploits temporal concatenation of H.264/AVC video bit-streams to dynamically adapt to both user commands and network conditions. The architecture has been designed to improve the viewing experience when accessing video content through individual and potentially bandwidth constrained connections. On the one hand, the user commands typically gives the client the opportunity to select interactively a preferred version among the multiple video clips that are made available to render the scene, e.g. using different view angles, or zoomed-in and slow-motion factors. On the other hand, the adaptation to the network bandwidth ensures effective management of the client buffer, which appears to be fundamental to reduce the client-server interaction latency, while maximizing video quality and preventing buffer underflow. In addition to user interaction and network adaptation, the deployment of fully autonomous infrastructures for interactive content distribution also requires the development of automatic versioning methods. Hence, the paper also surveys a number of approaches proposed for this purpose in surveillance and sport event contexts. Both objective metrics and subjective experiments are exploited to assess our system
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.