This work presents a human activity recognition (HAR) model based on audio features. Using sound as an information source for HAR is challenging because sound-wave analysis generates very large amounts of data; however, feature selection techniques can reduce the amount of data required to represent an audio sample. Among the audio features analyzed were Mel-frequency cepstral coefficients (MFCC). Although MFCC are commonly used in voice and instrument recognition, their utility within HAR models had yet to be confirmed, and this work validates their usefulness. Additionally, statistical features were extracted from the audio samples to build the proposed HAR model. The amount of information needed to build a HAR model directly impacts the model's accuracy, a problem also tackled in this work; our results indicate that the proposed HAR model can recognize a human activity with an accuracy of 85%. This means that minimal computational cost is required, allowing portable devices to identify human activities using audio as an information source.
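As a minimal illustrative sketch (not the authors' exact pipeline), statistical descriptors of an audio frame can be computed with plain NumPy; MFCCs themselves are typically obtained from a dedicated library such as librosa (`librosa.feature.mfcc`):

```python
import numpy as np

def statistical_features(signal: np.ndarray) -> dict:
    """Compute simple statistical descriptors of a single audio frame."""
    return {
        "mean": float(np.mean(signal)),
        "std": float(np.std(signal)),
        # root-mean-square energy of the frame
        "rms": float(np.sqrt(np.mean(signal ** 2))),
        # zero-crossing rate: fraction of adjacent samples with a sign change
        "zcr": float(np.mean(np.abs(np.diff(np.sign(signal))) > 0)),
    }

# Example: a 440 Hz sine sampled at 8 kHz has ~zero mean and RMS ~ 1/sqrt(2)
t = np.linspace(0, 1, 8000, endpoint=False)
sine = np.sin(2 * np.pi * 440 * t)
feats = statistical_features(sine)
```

A feature vector like this, concatenated with MFCCs per frame, is one common way to compress a raw waveform into a compact representation for a classifier.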
Distracted driving is one of the main causes of deaths and injuries on U.S. roads. According to the National Highway Traffic Safety Administration (NHTSA), among the different types of distraction, cellphone use is highly related to car accidents, a behavior commonly known as “texting and driving”: around 481,000 drivers were distracted by their cellphones while driving, and about 3,450 people were killed and 391,000 injured in car accidents involving distracted drivers in 2016 alone. Therefore, in this research, a novel methodology to detect drivers distracted by their cellphones is proposed. For this, a ceiling-mounted wide-angle camera coupled to a deep learning convolutional neural network (CNN) is implemented to detect such distracted drivers. The CNN is built on the Inception V3 deep neural network and trained to detect “texting and driving” subjects. The final CNN was trained and validated on a dataset of 85,401 images, achieving an area under the curve (AUC) of 0.891 on the training set, and an AUC of 0.86 and a sensitivity of 0.97 on a blind test. In this research, for the first time, a CNN is used to detect texting and driving, achieving significant performance. The proposed methodology can be incorporated into a smart car infotainment system, helping raise drivers’ awareness of their driving habits and associated risks, thereby reducing careless driving and promoting safe driving practices to lower the accident rate.
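The reported evaluation metrics can be reproduced from a model's scores and predictions. As a minimal sketch (illustrative toy data, not the paper's results), AUC can be computed via the Mann-Whitney pairwise-ranking formulation and sensitivity as the true-positive rate:

```python
import numpy as np

def auc_score(labels, scores):
    """AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one."""
    labels, scores = np.asarray(labels), np.asarray(scores)
    pos, neg = scores[labels == 1], scores[labels == 0]
    # count pairwise wins; ties count as half a win
    wins = (pos[:, None] > neg[None, :]).sum() \
         + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

def sensitivity(labels, preds):
    """True-positive rate: fraction of actual positives predicted positive."""
    labels, preds = np.asarray(labels), np.asarray(preds)
    return float((preds[labels == 1] == 1).mean())

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]   # classifier confidences for "texting"
preds = [0, 1, 1, 1]             # thresholded decisions
```

In practice one would use a library routine such as `sklearn.metrics.roc_auc_score`; the hand-rolled version above just makes the definition explicit.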