This paper focuses on real-time pedestrian detection onField Programmable Gate Arrays (FPGAs) using the Histograms of Oriented Gradients (HOG) descriptor in combination with a Support Vector Machine (SVM) for classification as a basic method. We propose to process image data at twice the pixel frequency and to normalize blocks with the L1-Sqrt-norm resulting in an efficient resource utilization. This implementation allows for parallel computation of different scales. Combined with a time-multiplex approach we increase multiscale capabilities beyond resource limitations. We are able to process 64 high resolution images (1920 × 1080 pixels) per second at 18 scales with a latency of less than 150 μs. 1.79 million HOG descriptors and their SVM classifications can be calculated per second and per scale, which outperforms current FPGA implementations by a factor of 4.
Prior work on automated methods demonstrated that it is possible to recognize pain intensity from frontal faces in videos, while there is an assumption that humans are very adept at this task compared to machines. In this paper, we investigate whether such an assumption is correct by comparing the results achieved by two human observers with the results achieved by a Random Forest classifier (RFc) baseline model (called RFc-BL) and by three proposed automated models. The first proposed model is a Random Forest classifying descriptors of Action Unit (AU) time series; the second is a modified MobileNetV2 CNN classifying face images that combine three points in time; and the third is a custom deep network combining two CNN branches using the same input as for MobileNetV2 plus knowledge of the RFc. We conduct experiments with X-ITE phasic pain database, which comprises videotaped responses to heat and electrical pain stimuli, each of three intensities. Distinguishing these six stimulation types plus no stimulation was the main 7-class classification task for the human observers and automated approaches. Further, we conducted reduced 5-class and 3-class classification experiments, applied Multi-task learning, and a newly suggested sample weighting method. Experimental results show that the pain assessments of the human observers are significantly better than guessing and perform better than the automatic baseline approach (RFc-BL) by about 1%; however, the human performance is quite poor due to the challenge that pain that is ethically allowed to be induced in experimental studies often does not show up in facial reaction. We discovered that downweighting those samples during training improves the performance for all samples. The proposed RFc and two-CNNs models (using the proposed sample weighting) significantly outperformed the human observer by about 6% and 7%, respectively.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.