ClassX is an interactive online lecture viewing system developed at Stanford University. Unlike existing solutions that restrict the user to a pre-defined view, ClassX allows interactive pan/tilt/zoom while watching the video. The interactive video streaming paradigm avoids sending the entire field-of-view at the recorded high resolution, thus reducing the required data rate. To ease the navigation burden on the online viewer, ClassX offers automatic tracking of the lecturer. ClassX also employs slide recognition technology, which allows automatic synchronization of digital presentation slides with those appearing in the lecture video. This paper presents a design overview of the ClassX system and the evaluation results of a 3-month pilot deployment at Stanford University. The results demonstrate that our system is a low-cost, efficient and pragmatic solution to interactive online lecture viewing.
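The data-rate argument in the ClassX abstract can be illustrated with a little arithmetic. The sketch below is not the ClassX implementation; it assumes a hypothetical viewer whose region of interest keeps the aspect ratio of the recorded frame and is described by a center point and a zoom factor. The function names `roi_rect` and `pixel_fraction` are made up for this example, and the pixel fraction is only a rough proxy for the data rate, which also depends on the encoder and the content.

```python
def roi_rect(frame_w, frame_h, center_x, center_y, zoom):
    """Return (x, y, w, h) of the region of interest inside the recorded frame.

    zoom >= 1; zoom == 1 selects the whole frame. The rectangle is shifted
    as needed so that it never extends past the frame borders.
    """
    if zoom < 1:
        raise ValueError("zoom must be >= 1")
    w = frame_w / zoom
    h = frame_h / zoom
    # Clamp the top-left corner so the rectangle stays inside the frame.
    x = min(max(center_x - w / 2, 0), frame_w - w)
    y = min(max(center_y - h / 2, 0), frame_h - h)
    return x, y, w, h


def pixel_fraction(frame_w, frame_h, rect):
    """Fraction of the recorded pixels covered by the region of interest."""
    _, _, w, h = rect
    return (w * h) / (frame_w * frame_h)


if __name__ == "__main__":
    # Hypothetical 3840x2160 recording, viewer zoomed 2x on the frame center.
    rect = roi_rect(3840, 2160, 1920, 1080, 2)
    print(rect, pixel_fraction(3840, 2160, rect))
```

Under these assumptions, a viewer zoomed in by a factor of z needs only 1/z² of the recorded pixels, which is the intuition behind streaming the selected region instead of the full field-of-view.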
This report proposes a new stochastic model of visual attention that predicts where humans are likely to focus in a video scene. The proposed model is composed of a dynamic Bayesian network that simulates and combines a person's visual saliency response and eye movement patterns to estimate the most probable regions of attention. Dynamic Markov random field (MRF) models are newly introduced to capture the spatiotemporal relationships of visual saliency responses. Experimental results show that the proposed model outperforms both the previous deterministic model and the stochastic model without dynamic MRF in predicting human visual attention.
Video streaming usage has seen a significant rise as entertainment, education, and business increasingly rely on online video. Optimizing video compression has the potential to improve users' access to content and its quality, and to reduce energy use and costs overall. In this paper, we present an application of the MuZero algorithm to the challenge of video compression. Specifically, we target the problem of learning a rate control policy to select the quantization parameters (QP) in the encoding process of libvpx, an open source VP9 video compression library widely used by popular video-on-demand (VOD) services. We treat this as a sequential decision-making problem to maximize the video quality with an episodic constraint imposed by the target bitrate. Notably, we introduce a novel self-competition-based reward mechanism to solve constrained RL with variable constraint satisfaction difficulty, which is challenging for existing constrained RL methods. We demonstrate that the MuZero-based rate control achieves an average 6.28% reduction in size of the compressed videos for the same delivered video quality level (measured as PSNR BD-rate) compared to libvpx's two-pass VBR rate control policy, while having better constraint satisfaction behavior.
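The abstract does not spell out how the self-competition reward is computed, so the following is only a minimal sketch of one plausible reading: at the end of an episode the agent's encode is compared with a baseline encode of the same video (for example, the agent's own earlier performance), and the reward is the win/tie/loss outcome. The `EpisodeResult` type, the overshoot measure, and the tie-breaking order are assumptions made for illustration, not the paper's formulation.

```python
from dataclasses import dataclass


@dataclass
class EpisodeResult:
    quality: float  # assumed quality metric, e.g. mean PSNR in dB
    bitrate: float  # achieved bitrate of the encoded video, in kbps


def overshoot(result: EpisodeResult, target_bitrate: float) -> float:
    """Amount by which the encode exceeds the target bitrate (0 if it fits)."""
    return max(0.0, result.bitrate - target_bitrate)


def self_competition_reward(current: EpisodeResult,
                            baseline: EpisodeResult,
                            target_bitrate: float) -> int:
    """Return +1 if `current` beats `baseline`, -1 if it loses, 0 on a tie.

    Assumed ordering: if either encode violates the bitrate constraint,
    the smaller overshoot wins; if both satisfy it, higher quality wins.
    """
    cur_over = overshoot(current, target_bitrate)
    base_over = overshoot(baseline, target_bitrate)
    if cur_over > 0 or base_over > 0:
        # Constraint satisfaction takes priority over quality.
        return (cur_over < base_over) - (cur_over > base_over)
    # Both encodes fit the budget, so compare delivered quality.
    return (current.quality > baseline.quality) - (current.quality < baseline.quality)


if __name__ == "__main__":
    baseline = EpisodeResult(quality=39.0, bitrate=950.0)
    current = EpisodeResult(quality=40.0, bitrate=900.0)
    print(self_competition_reward(current, baseline, target_bitrate=1000.0))
```

One appeal of a relative reward like this is that its scale stays the same whether the bitrate target is easy or hard to meet for a given video, which matches the "variable constraint satisfaction difficulty" the abstract mentions.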