2020
DOI: 10.1109/lnet.2020.2977124

Deep Learning for Content-Based Personalized Viewport Prediction of 360-Degree VR Videos

Abstract: In this paper, the problem of head movement prediction for virtual reality videos is studied. In the considered model, a deep learning network is introduced to leverage position data as well as video frame content to predict future head movement. For optimizing data input into this neural network, data sample rate, reduced data, and long-period prediction length are also explored for this model. Simulation results show that the proposed approach yields 16.1% improvement in terms of prediction accuracy compared…

Cited by 24 publications
(15 citation statements)
References 15 publications
“…Kan et al [8] proposed a DRL-based rate adaptation algorithm qualified for learning a policy to strike an optimal trade-off between the video quality, the risk of rebuffering, and the smoothness of video quality by selecting the proper bitrate of tiles. Previous works [9][10][11][12] designed a hybrid architecture of CNN and LSTM models. They used convolutional neural networks (CNN) and LSTM to extract video content features from saliency maps, original images, or patterns of motion from historical rotation information.…”
Section: Viewport Prediction
Citation type: mentioning
confidence: 99%
“…The most advanced VR streaming research mainly focuses on viewport prediction [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16]. The existing solutions [1][11][12][13] suggest prefetching all the tiles of each segment, and higher quality of prefetching predicts the tiles in the viewport.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Xu et al [14] propose a deep reinforcement learning (DRL) based approach to better model the users' attention with video contents. The works in [16]-[19] design a hybrid architecture of CNN and LSTM models. They use a convolutional neural network (CNN) to extract video content features from saliency maps or original images, and use LSTM to extract motion patterns from history rotations.…”
Section: B. Viewport Prediction
Citation type: mentioning
confidence: 99%