Deep learning based on the Convolutional Neural Network (CNN) has shown promising results in various vision-based applications, recently also in camera-based vital signs monitoring. CNN-based Photoplethysmography (PPG) extraction has, so far, focused on performance rather than understanding. In this paper, we try to answer four questions with experiments aimed at improving our understanding of this methodology as it gains popularity. We conclude that the network exploits the blood absorption variation to extract the physiological signals, and that the choice and parameters (phase, spectral content, etc.) of the reference signal may be more critical than anticipated. The availability of multiple convolutional kernels is necessary for the CNN to arrive at a flexible channel combination through the spatial operation, but may not provide the same motion-robustness as a multi-site measurement using knowledge-based PPG extraction. Finally, we conclude that PPG-related prior knowledge is still helpful for CNN-based PPG extraction. Consequently, we recommend further investigation of hybrid CNN-based methods that include prior knowledge in their design.
Introduction

Remote Photoplethysmography (remote-PPG) is a contactless way to measure human cardiovascular activity by measuring the reflection variations of the skin registered by a video camera [1]. Over the last decade, various remote-PPG methods [2-7] have been proposed for PPG-signal extraction. The methods differ in their choice of assumptions [2-5] and use of handcrafted features (e.g. the projected color features of CHROM [4] and POS [5]), and these choices affect their robustness with respect to illumination variations and subject motion.
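To make the notion of handcrafted projected color features concrete, the following is a minimal sketch of a POS-style pulse extraction in the spirit of [5]: mean RGB traces are temporally normalized per sliding window, projected onto two axes orthogonal to the skin tone, alpha-tuned, and overlap-added. The function name, window length, and the synthetic input are our own illustrative choices, not code from the cited work.

```python
import numpy as np

def pos_pulse(rgb, win_len=48):
    """Sketch of a POS-style pulse extraction from a (T, 3) array of
    spatially averaged skin-RGB values (illustrative, not reference code)."""
    n = rgb.shape[0]
    pulse = np.zeros(n)
    for t in range(n - win_len + 1):
        c = rgb[t:t + win_len]                    # one sliding window
        cn = c / c.mean(axis=0)                   # temporal normalization
        s1 = cn[:, 1] - cn[:, 2]                  # projection axis 1: G - B
        s2 = cn[:, 1] + cn[:, 2] - 2 * cn[:, 0]   # projection axis 2: G + B - 2R
        h = s1 + (s1.std() / (s2.std() + 1e-9)) * s2  # alpha tuning
        pulse[t:t + win_len] += h - h.mean()      # overlap-add
    return pulse

# Synthetic check: RGB traces modulated by a 1.2 Hz "pulse"
t = np.arange(300) / 30.0
p = np.sin(2 * np.pi * 1.2 * t)
rgb = np.stack([1 + 0.005 * p, 1 + 0.010 * p, 1 + 0.003 * p], axis=1)
out = pos_pulse(rgb)
```

The recovered signal should correlate strongly with the injected pulse, illustrating how a fixed, knowledge-based projection replaces any learned feature extraction.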
Related work

Recently, the success of deep Convolutional Neural Network (CNN) methods that automatically learn relevant features from images/videos in various applications has inspired researchers to attempt CNN-based remote-PPG extraction [8][9][10][11][12]. Chen and McDuff [8] proposed a convolutional attention network consisting of two parallel models to extract the PPG signal from a video. The first model is a classical "appearance model" [13] that learns to find the skin region-of-interest (RoI), while the second parallel path, fed with DC-normalized frame-differences from the RoI, learns to extract the PPG signal, using a finger oximeter-derived signal as a reference. In [8], the second model is referred to as a "motion model", but we prefer the term "normalized frame difference model", since our work will show that it exploits the blood absorption variation rather than the skin motion as suggested by [8]. SynRhythm [9] is a general-to-specific transfer learning method; the authors directly convert the spatial-temporal features into heart rate based on the pre-trained network [14]. HR-CNN [10] consists of an extractor CNN and an HR-estimator CNN with different loss functions to predict the heart rate, rather than the PPG signal. PhysNet [11] is a 3D CNN which learns the temporal and spatial context features of f...
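The DC-normalized frame-difference input described above can be sketched as follows. This is our illustrative reading of the preprocessing in [8], where consecutive frames C(t) and C(t+1) are combined as (C(t+1) - C(t)) / (C(t+1) + C(t)); the function name, epsilon, and the standardization step are assumptions, not the authors' code.

```python
import numpy as np

def normalized_frame_diff(frames, eps=1e-8):
    """Sketch of DC-normalized frame differences for a (T, H, W, C) clip:
    d(t) = (C(t+1) - C(t)) / (C(t+1) + C(t)), then standardized.
    Illustrative only; details may differ from the cited implementation."""
    f0, f1 = frames[:-1].astype(np.float64), frames[1:].astype(np.float64)
    d = (f1 - f0) / (f1 + f0 + eps)   # removes the DC level per pixel
    return d / (d.std() + eps)        # unit-variance input for the CNN

# Toy clip: a flat gray video with a small temporal intensity modulation
clip = 0.5 + 0.01 * np.sin(np.arange(10))[:, None, None, None] * np.ones((10, 4, 4, 3))
diffs = normalized_frame_diff(clip)
```

Because the division cancels the shared DC reflection level, the result is dominated by relative intensity changes, which is consistent with our argument that such a network responds to blood absorption variation rather than motion per se.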