This paper proposes a novel approach for visually anonymizing video clips while retaining the ability to perform machine-based analysis of those clips, such as human action recognition. Visual anonymization is achieved by a novel method that generates the anonymization silhouette by modelling frame-wise temporal visual salience. These temporal-salience-based silhouettes are then analysed by extracting the proposed histograms of gradients in salience (HOG-S) to learn the action representation in the visually anonymized domain. Since the anonymization maps are derived from grayscale temporal salience maps, only the moving body parts involved in the action are assigned large gray values, forming highly anonymized silhouettes. This yields the highest mean anonymity score (MAS), the fewest identifiable visual appearance attributes and high human-perceived utility in action recognition. In terms of machine-based human action recognition, the proposed HOG-S features achieve the highest accuracy in the anonymized domain compared to existing anonymization methods. Overall, the proposed holistic human action recognition method, i.e., temporal salience modelling followed by HOG-S feature extraction, achieves the best action recognition accuracy on the DHA, KTH, UIUC1, UCF Sports and HMDB51 datasets, with improvements of 3%, 1.6%, 0.8%, 1.3% and 16.7%, respectively, outperforming both feature-based and deep learning-based existing approaches.

INDEX TERMS Visual anonymization, human action recognition, histogram of gradients in salience (HOG-S), temporal visual salience estimation, privacy, video-based monitoring, assisted living.
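To make the HOG-S idea concrete, the following is a minimal NumPy sketch of a HOG-style descriptor computed directly on a grayscale temporal salience map. It is an illustration only and makes assumptions the abstract does not specify (a single global magnitude-weighted orientation histogram with 9 unsigned-orientation bins and L2 normalisation); the paper's actual binning and spatial block layout may differ.

```python
import numpy as np

def hog_s(salience_map, n_bins=9):
    """Illustrative histogram of gradients on a grayscale temporal
    salience map (bin count and normalisation are assumptions)."""
    s = salience_map.astype(np.float64)
    # Central-difference gradients of the salience values.
    gy, gx = np.gradient(s)
    mag = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees, as in classic HOG.
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0
    # Magnitude-weighted orientation histogram over the whole map.
    hist, _ = np.histogram(ang, bins=n_bins, range=(0.0, 180.0), weights=mag)
    # L2-normalise so the descriptor magnitude is comparable across frames.
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

# Toy salience map: a bright "moving region" blob on a dark background.
sal = np.zeros((32, 32))
sal[8:24, 8:24] = 200.0
desc = hog_s(sal)
print(desc.shape)  # (9,)
```

Because the salience map is near-zero for static background, the descriptor is dominated by gradients around the moving body parts, which is the intuition behind extracting HOG in the salience domain rather than on the raw appearance.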