Traditionally, audio quality and video quality are evaluated separately in subjective tests. Best practices within the quality assessment community were developed before many modern mobile audiovisual devices and services came into use, such as internet video, smartphones, tablets and connected televisions. These devices and services raise unique questions that require jointly evaluating both the audio and the video within a subjective test. However, audiovisual subjective testing is a relatively under-explored field. In this paper, we address the question of determining the most suitable way to conduct audiovisual subjective testing over a wide range of audiovisual quality. Six laboratories from four countries conducted a systematic study of audiovisual subjective testing. The stimuli and scale were held constant across experiments and labs; only the environment of the subjective test was varied. Some subjective tests were conducted in controlled environments and some in public environments (a cafeteria, patio or hallway). The audiovisual stimuli spanned a wide range of quality. Results show that these audiovisual subjective tests were highly repeatable from one laboratory and environment to the next. The number of subjects was the most important factor. Based on this experiment, 24 or more subjects are recommended for Absolute Category Rating (ACR) tests. In public environments, 35 subjects were required to obtain the same Student's t-test sensitivity. The second most important variable was individual differences between subjects. Other environmental factors, such as language, country, lighting, background noise, wall color, and monitor calibration, had minimal impact. Analyses indicate that Mean Opinion Scores (MOS) are relative rather than absolute. Our analyses show that the results of experiments conducted in pristine laboratory environments are highly representative of those obtained with devices in actual use, in a typical user environment.
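The analysis pipeline described above can be illustrated with a minimal sketch: computing a MOS for each condition from 5-point ACR ratings and comparing two conditions with a t-test (Welch's variant is used here for robustness to unequal variances; the rating data below are hypothetical, not from the study).

```python
# Illustrative sketch only: MOS and a Welch's t statistic from 5-point ACR
# ratings. The rating values are invented for demonstration; they are not
# data from the experiments described in the abstract.
from statistics import mean, stdev
from math import sqrt

def mos(ratings):
    """Mean Opinion Score: the arithmetic mean of ACR ratings (1-5 scale)."""
    return mean(ratings)

def welch_t(a, b):
    """Welch's t statistic for two independent samples of ratings."""
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))

# Hypothetical ratings from 24 subjects (the recommended minimum) for two
# processed versions of the same clip.
cond_a = [4, 5, 4, 4, 3, 5, 4, 4, 5, 4, 3, 4, 4, 5, 4, 4, 3, 4, 5, 4, 4, 4, 3, 5]
cond_b = [3, 2, 3, 3, 2, 4, 3, 3, 2, 3, 3, 2, 3, 4, 3, 3, 2, 3, 3, 2, 3, 3, 2, 3]

print(f"MOS A = {mos(cond_a):.2f}, MOS B = {mos(cond_b):.2f}")
print(f"Welch t = {welch_t(cond_a, cond_b):.2f}")
```

A larger |t| for the same underlying quality difference is what the abstract means by "t-test sensitivity": with more subjects, the denominator shrinks and smaller MOS differences become statistically distinguishable.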
The paper presents a series of three new video quality model standards for the assessment of sequences of up to UHD/4K resolution. They were developed in a competition within the International Telecommunication Union (ITU-T), Study Group 12, in collaboration with the Video Quality Experts Group (VQEG), over a period of more than two years. A large video quality test set with a total of 26 individual databases was created, with 13 used for training and 13 for validation and selection of the winning models. For each database, video quality laboratory tests were run with at least 24 subjects each. The 5-point Absolute Category Rating (ACR) scale was used for rating, with Mean Opinion Scores (MOS) calculated as ground truth. To represent today's commonly applied HTTP-based adaptive streaming context, the test sequences comprise a variety of encoding settings, bitrates, resolutions and frame rates for the three codecs H.264/AVC, H.265/HEVC and VP9, applied to a wide range of source sequences of around 8 s duration. Processing was carried out with an FFmpeg-based processing chain developed specifically for the competition, and via upload and encoding through exemplary online streaming services. The resulting data represents the largest lab-test-based dataset used for video quality model development to date, with a total of around 5,000 test sequences. The paper addresses the three models ultimately standardized in the P.1204 Recommendation series, covering different model types and applications: (i) Rec. P.1204.3, no-reference bitstream-based, with access to encoded bitstream information; (ii) P.1204.4, pixel-based, using information from the reference and the processed video; and (iii) P.1204.5, no-reference hybrid, using both bitstream and pixel information without knowledge of the reference.
The paper outlines the development process and provides comprehensive details about the statistical evaluation, test databases, model algorithms and validation results, as well as a performance comparison with state-of-the-art models.
INDEX TERMS: bitstream, full reference, HTTP adaptive streaming (HAS), hybrid, pixel-based, QoE, reduced reference, video quality.
In 2011, the Video Quality Experts Group (VQEG) ran subjects through the same audiovisual subjective test at six different international laboratories. That small dataset is now publicly available for research and development purposes.