“…The overall video quality is then revised by processing viewports with the predicted saliency map. Instead of handling the entire video, Qian et al. [38] created a video bag by grouping a specific number of video blocks, from which features are extracted to predict the overall quality of the video. However, constructing a 360° video dataset requires a lot of time and resources, and questions such as "how long should a video be to evaluate its quality efficiently", "how should quality be changed adaptively within a video", etc.…”