2022
DOI: 10.48550/arxiv.2207.02595
Preprint

FAST-VQA: Efficient End-to-end Video Quality Assessment with Fragment Sampling

Abstract: Current deep video quality assessment (VQA) methods usually incur high computational costs when evaluating high-resolution videos. This cost hinders them from learning better video-quality-related representations via end-to-end training. Existing approaches typically reduce the cost with naive sampling schemes such as resizing and cropping; however, these schemes corrupt quality-related information in videos and are therefore suboptimal for learning good representations for VQA. Therefore, there is an …

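The abstract contrasts fragment sampling with naive resizing and cropping. As a rough, non-authoritative sketch of the idea (this is not the authors' code; the grid size, patch size, and shared per-clip offsets are illustrative assumptions), the snippet below cuts raw-resolution mini-patches from a uniform grid over each frame and splices them into one small input, so local quality-related textures survive while the input size stays fixed:

import numpy as np

def sample_fragments(video, grid=7, patch=32, rng=None):
    """Splice raw-resolution mini-patches from a uniform grid over each frame.

    video: (T, H, W, C) array of frames.
    Returns a (T, grid*patch, grid*patch, C) array. The same patch offsets
    are reused for every frame so fragments stay temporally aligned.
    """
    rng = rng or np.random.default_rng()
    t, h, w, c = video.shape
    out = np.empty((t, grid * patch, grid * patch, c), dtype=video.dtype)
    cell_h, cell_w = h // grid, w // grid
    for gy in range(grid):
        for gx in range(grid):
            # random top-left corner of a raw-resolution patch inside this
            # grid cell, shared across all frames (temporal alignment)
            oy = gy * cell_h + rng.integers(0, max(cell_h - patch, 0) + 1)
            ox = gx * cell_w + rng.integers(0, max(cell_w - patch, 0) + 1)
            out[:, gy * patch:(gy + 1) * patch,
                gx * patch:(gx + 1) * patch] = video[:, oy:oy + patch,
                                                     ox:ox + patch]
    return out

# Example: a 32-frame 1080p clip becomes a compact 224x224 input
clip = np.zeros((32, 1080, 1920, 3), dtype=np.uint8)
fragments = sample_fragments(clip)  # shape (32, 224, 224, 3)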
Cited by 2 publications (1 citation statement)
References 36 publications (68 reference statements)

“…RAPIQUE [45] combines and exploits the advantages of both quality-aware scene statistics features and semantics-aware deep convolutional features, designing a general and efficient spatial and temporal bandpass statistical model for VQA. Instead of extracting handcrafted features, deep VQA methods [4,19,48,49,56,58] use CNNs to extract rich semantic features and run regression on the extracted features to predict video quality. For example, MLSP-FF [8] extracts frame-wise features with the Inception-ResNet-v2 model [42], and some works [48,56,58] introduce 3D-CNNs instead of 2D-CNNs to extract temporal features more effectively.…”
Section: Related Work 2.1 Video Quality Assessment
confidence: 99%
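To make the pipeline in the quoted statement concrete, here is a minimal, hypothetical sketch (not any of the cited models; the layer sizes are arbitrary assumptions) of a small 3D-CNN feature extractor followed by a regression head that maps a clip to a scalar quality score:

import torch
import torch.nn as nn

class TinyVQARegressor(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        # 3D convolutions see (time, height, width) jointly, unlike a
        # frame-wise 2D-CNN, so motion-related distortions are captured
        self.features = nn.Sequential(
            nn.Conv3d(3, channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels * 2, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),  # global spatio-temporal pooling
        )
        self.head = nn.Linear(channels * 2, 1)  # regress to a quality score

    def forward(self, clip):
        # clip: (batch, 3, T, H, W) in [0, 1]
        feats = self.features(clip).flatten(1)
        return self.head(feats).squeeze(-1)

# Example: score a batch of two 16-frame 224x224 clips
model = TinyVQARegressor()
scores = model(torch.rand(2, 3, 16, 224, 224))  # tensor of shape (2,)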