Very Long Term Field of View Prediction for 360-Degree Video Streaming

Li, Chenge; Zhang, Weixi; Liu, Yong; Wang, Yao

doi:10.1109/mipr.2019.00060

Cited by 54 publications

(36 citation statements)

References 15 publications

(24 reference statements)


“…Petrangeli et al [11] first identify user clusters with a kind of spectral clustering algorithm, then fit a regression model for each cluster, finally predict with the regression model from the user's corresponding cluster. Li et al [12] utilize LSTM and Attentive Mixture Experts (AME) techniques to train a model to predict based on both the target user's historical fixations and other users' fixations. Nasrabadi et al [13] first cluster users based on their quaternion rotations, and then classify the target user to the corresponding cluster and estimate the future fixation as the cluster center.…”

Section: B Viewport Prediction (mentioning)

confidence: 99%

SVP: Sinusoidal Viewport Prediction for 360-Degree Video Streaming

et al. 2020


The rapid growth of user expectations and network technologies has proliferated the service needs of 360-degree video streaming. In the light of the unprecedented bitrates required to deliver entire 360-degree videos, tile-based streaming, which associates viewport and non-viewport tiles with different qualities, has emerged as a promising way to facilitate 360-degree video streaming in practice. Existing work on viewport prediction primarily targets prediction accuracy, which potentially gives rise to excessive computational overhead and latency. In this paper, we propose a sinusoidal viewport prediction (SVP) system for 360-degree video streaming to overcome the aforementioned issues. In particular, the SVP system leverages 1) sinusoidal values of rotation angles to predict orientation, 2) the relationship between prediction errors, prediction time window and head movement velocities to improve the prediction accuracy, and 3) the normalized viewing probabilities of tiles to further improve adaptive bitrate (ABR) streaming performance. To evaluate the performance of the SVP system, we conduct extensive simulations based on real-world datasets. Simulation results demonstrate that the SVP system outperforms state-of-the-art schemes under various buffer thresholds and bandwidth settings in terms of viewport prediction accuracy and video quality, revealing its applicability to both live and video-on-demand streaming in practical scenarios. INDEX TERMS 360-degree video, viewport prediction, live streaming, video on demand.
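The abstract's first idea, predicting orientation from sinusoidal values of the rotation angles, can be illustrated with a minimal sketch. This is not the SVP system's actual implementation: the function names `fit_line` and `predict_yaw`, the choice of an ordinary least-squares line as the predictor, and the restriction to a single yaw angle in degrees are all assumptions made for illustration. The point it shows is that extrapolating sin and cos separately, then recombining them with `atan2`, avoids the discontinuity at ±180° that breaks regression on raw angles.

```python
import math


def fit_line(ts, ys):
    """Ordinary least-squares fit y = a*t + b; returns (a, b)."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    var = sum((t - t_mean) ** 2 for t in ts)
    if var == 0:
        # All timestamps identical: fall back to a constant prediction.
        return 0.0, y_mean
    cov = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    a = cov / var
    return a, y_mean - a * t_mean


def predict_yaw(ts, yaws_deg, horizon):
    """Hypothetical helper: predict yaw (degrees) `horizon` seconds ahead.

    The sine and cosine of the yaw history are extrapolated independently
    and recombined with atan2, so the result lies in [-180, 180] and is
    unaffected by wraparound in the input samples.
    """
    sins = [math.sin(math.radians(y)) for y in yaws_deg]
    coss = [math.cos(math.radians(y)) for y in yaws_deg]
    t_future = ts[-1] + horizon
    a_s, b_s = fit_line(ts, sins)
    a_c, b_c = fit_line(ts, coss)
    s = a_s * t_future + b_s
    c = a_c * t_future + b_c
    return math.degrees(math.atan2(s, c))


# A head turning steadily at 50 degrees/second across the +/-180 boundary.
timestamps = [0.0, 0.1, 0.2, 0.3]
yaw_history = [170.0, 175.0, -180.0, -175.0]
predicted = predict_yaw(timestamps, yaw_history, horizon=0.1)
```

Here the true continuation of the motion is about -170°, and the sketch lands close to it, whereas a line fitted to the raw angle samples would be pulled far off by the jump from 175° to -180°.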



“…FoV prediction is critical to the performance of FoV-adaptive streaming. In the past, linear regression, weighted linear regression, and truncated linear prediction [5,17,18] as well as neural network based methods [2,6,12] have been proposed. Most of these methods can predict the short-term FoV (within the future 1 second) well with an accuracy of more than 90% [17][12] [6].…”

Section: Predicting FoV and Bandwidth (mentioning)

confidence: 99%

Low-latency FoV-adaptive Coding and Streaming for Interactive 360° Video Streaming

Mao, Sun, Liu, et al. 2020

Proceedings of the 28th ACM International Conference on Multimedia

Self Cite


In 360° video interactive streaming, it is critical to minimize the end-to-end frame delay. It is also important to predict the user's field of view (FoV) and allocate more bits in regions within the predicted FoV. Towards both goals, we propose a low-delay FoV-adaptive coding and delivery system that is robust to bandwidth variations and FoV prediction errors. Each frame is coded only in the predicted FoV (PF), a border surrounding the predicted FoV (PF+), and a rotating intra (RI) region. To maximize the coding efficiency, the PF and PF+ regions are coded with temporal and spatial prediction, while the RI region is coded with spatial prediction only. The RI region enables periodic refreshment of the entire frame and provides robustness to both FoV prediction errors and frame losses. The total bit budget is adapted both at the segment level based on the predicted average bandwidth for the segment and at the frame level based on the sender buffer status, to ensure timely delivery. The system further adapts the sizes and coding rates of different regions for each video segment to maximize the average rendered video quality under the total bit budget. To enable such adaptation, we propose novel ways to model the quality-rate (Q-R) relations of coded regions that take into account potentially misaligned coded regions in successive frames due to FoV dynamics. We examine the performance of the proposed system and three benchmark systems, under real-world bandwidth traces and FoV traces, and demonstrate that the proposed system can significantly improve the rendered video quality over the benchmark systems. Furthermore, the proposed system can achieve very low end-to-end frame delay while maintaining a low frame freeze probability and providing smooth video playback. CCS CONCEPTS • Human-centered computing → Virtual reality; • Information systems → Multimedia streaming.


“…As such, the major part of the available resources can be allocated to this particular part of the video hemisphere, resulting in the perceived quality being optimized. Two major types of viewport prediction can be identified: content-based [22] and content-agnostic [23] pre-…”

Section: Accurate and Real-time Viewport Prediction: Knowing The U (mentioning)

confidence: 99%

“…Content-based prediction aims at characterizing the given content, independent from the user, in terms of saliency maps, motion detection, Regions-Of-Interest (ROIs) etc. Based on this information, density functions can be created that estimate the probability of a generic user looking at a particular part of the hemisphere at a given time instant [22]. Content-agnostic approaches, on the other hand, do not take the content into consideration, but try to predict the user's future fixation point based on this and other users' historical movement.…”

Section: Authors (mentioning)

confidence: 99%

Human-centric Quality Management of Immersive Multimedia Applications

Damme, Vega, Turck 2020

2020 6th IEEE Conference on Network Softwarization (NetSoft)


Augmented Reality (AR) and Virtual Reality (VR) multimodal systems are the latest trend within the field of multimedia. As they emulate the senses by means of omnidirectional visuals, 360° sound, motion tracking and touch simulation, they are able to create a strong feeling of presence and interaction with the virtual environment. These experiences can be applied for virtual training (Industry 4.0), tele-surgery (healthcare) or remote learning (education). However, given the strong time and task sensitiveness of these applications, it is of great importance to sustain the end-user quality, i.e. the Quality-of-Experience (QoE), at all times. Lack of synchronization and quality degradation need to be reduced to a minimum to avoid feelings of cybersickness or loss of immersiveness and concentration. This means that there is a need to shift the quality management from system-centered performance metrics towards a more human, QoE-centered approach. However, this requires novel techniques in the three areas of the QoE-management loop (monitoring, modelling and control). This position paper identifies open areas of research to fully enable human-centric driven management of immersive multimedia. To this extent, four main dimensions are put forward: (1) Task and well-being driven subjective assessment; (2) Real-time QoE modelling; (3) Accurate viewport prediction; (4) Machine Learning (ML)-based quality optimization and content recreation. This paper discusses the state-of-the-art and provides possible solutions to tackle the open challenges.

