Detecting representative frames in videos of human actions is challenging because of the combined influence of human pose during the action and the background. This paper proposes a key frame extraction algorithm based on dynamic spatio-temporal slice clustering. The algorithm first applies a slice position selection method based on a human mask heat map, which chooses the slice positions dynamically, and then extracts the spatio-temporal slice images at those positions. It then clusters the spatio-temporal slice images and extracts key frames according to the clustering results. Experimental results demonstrate the validity of the spatio-temporal slice position selection method and show that the proposed algorithm effectively addresses the information redundancy and missing key information found in existing methods. We conduct experiments on UCF101, a challenging human action dataset, and show that our method detects key frames with high accuracy.
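The pipeline described above (heat-map-guided slice position, slice extraction, clustering, key frame selection) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the selection rule, the slice orientation, or the clustering method, so the sketch assumes a single horizontal slice at the row with the most accumulated mask heat, plain k-means over the per-frame slice rows, and one key frame per cluster (the frame nearest the centroid). All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = List[List[float]]  # grayscale frame: rows of pixel intensities


def dist2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_slice_row(heatmaps: Sequence[Frame]) -> int:
    """Assumed rule: pick the row with the largest heat summed over all frames."""
    height = len(heatmaps[0])
    row_heat = [sum(sum(h[r]) for h in heatmaps) for r in range(height)]
    return max(range(height), key=row_heat.__getitem__)


def extract_slice(frames: Sequence[Frame], row: int) -> List[List[float]]:
    """Stack the chosen row of every frame into a T x W slice image."""
    return [list(f[row]) for f in frames]


def kmeans(points: List[List[float]], k: int,
           iters: int = 20) -> Tuple[List[int], List[List[float]]]:
    """Plain k-means; initial centroids are evenly spaced points (deterministic)."""
    n = len(points)
    centroids = [list(points[(i * n) // k]) for i in range(k)]

    def assign() -> List[int]:
        return [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]

    for _ in range(iters):
        labels = assign()
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(), centroids


def key_frames(frames: Sequence[Frame], heatmaps: Sequence[Frame], k: int) -> List[int]:
    """Return sorted indices of the frames closest to each cluster centroid."""
    row = select_slice_row(heatmaps)
    slice_img = extract_slice(frames, row)
    labels, centroids = kmeans(slice_img, k)
    keys = []
    for c in range(k):
        idx = [i for i, l in enumerate(labels) if l == c]
        if idx:
            keys.append(min(idx, key=lambda i: dist2(slice_img[i], centroids[c])))
    return sorted(keys)
```

Clustering the rows of the slice image groups frames whose content along the slice is similar, so taking one frame per cluster is a simple way to limit redundancy while still covering distinct phases of the action. A real system would obtain the heat maps from a human segmentation model and would likely use more than one slice.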