Deep Learning for Content-Based Personalized Viewport Prediction of 360-Degree VR Videos

Chen, Xinwei; Kasgari, Ali Taleb Zadeh; Saad, Walid

doi:10.1109/lnet.2020.2977124

Cited by 24 publications

(15 citation statements)

References 15 publications

“…Kan et al. [8] proposed a DRL-based rate adaptation algorithm qualified for learning a policy to strike an optimal trade-off between the video quality, the risk of rebuffering, and the smoothness of video quality by selecting the proper bitrate of tiles. Previous works [9][10][11][12] designed a hybrid architecture of CNN and LSTM models. They used convolutional neural networks (CNN) and LSTM to extract video content features from saliency maps, original images, or patterns of motion from historical rotation information.…”

Section: Viewport Prediction (mentioning)

confidence: 99%

“…The most advanced VR streaming research mainly focuses on viewport prediction [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16]. The existing solutions [1][11][12][13] suggest prefetching all the tiles of each segment, and higher quality of prefetching predicts the tiles in the viewport. Most existing viewport prediction algorithms can be divided into trajectory-based [1][2][3][4][5][6] and content-based [7][8][9][10][11][12][13][14][15][16] methods.…”

Section: Introduction (mentioning)

confidence: 99%

“…The existing solutions [1][11][12][13] suggest prefetching all the tiles of each segment, and higher quality of prefetching predicts the tiles in the viewport. Most existing viewport prediction algorithms can be divided into trajectory-based [1][2][3][4][5][6] and content-based [7][8][9][10][11][12][13][14][15][16] methods. However, these methods are either difficult to achieve sufficient prediction accuracy, or they occupy a large amount of server resources in actual deployment and cannot achieve excellent real-time performance due to the complexity of their own algorithms.…”

Section: Introduction (mentioning)

confidence: 99%

Lightweight Neural Network-Based Viewport Prediction for Live VR Streaming in Wireless Video Sensor Network

Chen, Cao, Ahmad. 2021. Mobile Information Systems

Live virtual reality (VR) streaming (a.k.a. 360-degree video streaming) has become increasingly popular because of the rapid growth of head‐mounted displays and 5G networking deployment. However, the huge bandwidth and energy required to deliver live VR frames in the wireless video sensor network (WVSN) become bottlenecks, preventing the application from being deployed more widely. To solve the bandwidth and energy challenges, VR video viewport prediction has been proposed as a feasible solution. However, existing work mainly focuses on bandwidth usage and prediction accuracy and ignores the resource consumption of the server. In this study, we propose a lightweight neural network-based viewport prediction method for live VR streaming in WVSN to overcome these problems. In particular, we (1) use a compressed channel lightweight network (C-GhostNet) to reduce the parameters of the whole model and (2) use an improved gated recurrent unit module (GRU-ECA) and C-GhostNet to process the video data and head movement data separately to improve the prediction accuracy. To evaluate the performance of our method, we conducted extensive experiments using an open VR user dataset. The experimental results demonstrate that our method achieves significant server resource savings, real-time performance, and high prediction accuracy, while achieving low bandwidth usage and low energy consumption in WVSN, which meets the requirements of live VR streaming.

“…Xu et al. [14] propose a deep reinforcement learning (DRL) based approach to better model the users' attention with video contents. The works in [16]-[19] design a hybrid architecture of CNN and LSTM models. They use a convolutional neural network (CNN) to extract video content features from saliency maps or original images, and use LSTM to extract motion patterns from history rotations.…”

Section: B Viewport Prediction (mentioning)

confidence: 99%

SVP: Sinusoidal Viewport Prediction for 360-Degree Video Streaming

et al. 2020

The rapid growth of user expectations and network technologies has proliferated the service needs of 360-degree video streaming. In the light of the unprecedented bitrates required to deliver entire 360-degree videos, tile-based streaming, which associates viewport and non-viewport tiles with different qualities, has emerged as a promising way to facilitate 360-degree video streaming in practice. Existing work on viewport prediction primarily targets prediction accuracy, which potentially gives rise to excessive computational overhead and latency. In this paper, we propose a sinusoidal viewport prediction (SVP) system for 360-degree video streaming to overcome the aforementioned issues. In particular, the SVP system leverages 1) sinusoidal values of rotation angles to predict orientation, 2) the relationship between prediction errors, prediction time window and head movement velocities to improve the prediction accuracy, and 3) the normalized viewing probabilities of tiles to further improve adaptive bitrate (ABR) streaming performance. To evaluate the performance of the SVP system, we conduct extensive simulations based on real-world datasets. Simulation results demonstrate that the SVP system outperforms state-of-the-art schemes under various buffer thresholds and bandwidth settings in terms of viewport prediction accuracy and video quality, revealing its applicability to both live and video-on-demand streaming in practical scenarios.

Index Terms: 360-degree video, viewport prediction, live streaming, video on demand.
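The abstract does not say how the sinusoidal values are used, so the following is only a minimal sketch of why sine and cosine values of a rotation angle can be easier to predict than the raw angle: yaw wraps around at ±180 degrees, while its (sin, cos) pair changes smoothly. The function names and the linear extrapolation step are assumptions made for illustration and are not the SVP system's actual predictor.

```python
import math


def encode_angle(deg):
    """Map an angle in degrees to its (sin, cos) pair."""
    rad = math.radians(deg)
    return math.sin(rad), math.cos(rad)


def decode_angle(s, c):
    """Recover an angle in [-180, 180] degrees from a (sin, cos) pair."""
    return math.degrees(math.atan2(s, c))


def naive_extrapolate(prev_deg, curr_deg, steps=1):
    """Linear extrapolation on raw angles; breaks at the wrap-around."""
    return curr_deg + steps * (curr_deg - prev_deg)


def sinusoidal_extrapolate(prev_deg, curr_deg, steps=1):
    """Hypothetical predictor: linear extrapolation in (sin, cos) space."""
    s0, c0 = encode_angle(prev_deg)
    s1, c1 = encode_angle(curr_deg)
    s = s1 + steps * (s1 - s0)
    c = c1 + steps * (c1 - c0)
    # atan2 depends only on the direction of (c, s), so no renormalization is needed.
    return decode_angle(s, c)


if __name__ == "__main__":
    # The head turns 6 degrees per sample and crosses the +/-180 boundary:
    # 175 degrees, then 181 degrees, which a sensor reports as -179.
    print("raw angles:", naive_extrapolate(175, -179))
    print("sin/cos   :", sinusoidal_extrapolate(175, -179))
```

In this example the raw-angle extrapolation treats the jump from 175 to -179 as a 354-degree turn, whereas the extrapolation in (sin, cos) space stays close to the expected next yaw of about -173 degrees.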

Predicting user visual attention in virtual reality with a deep learning model

Shan, Chen, et al. 2021. Virtual Reality
