As part of an ongoing project, our team is actively developing new synthetic assistant (SA) technologies to support the training of combat medics and medical first responders. Medical first responders must be well trained to handle emergencies effectively, which requires real-time monitoring of, and feedback for, each trainee. We therefore introduced a voice-based SA to augment the training of medical first responders and enhance their performance in the field. The potential benefits of SAs include reduced training costs and enhanced monitoring mechanisms. Despite the growing use of voice-based personal assistants (PAs) in day-to-day life, their human-factors implications are commonly neglected. This paper therefore analyzes the performance of the developed voice-based SA in emergency care provider training for a selected emergency treatment scenario. The research follows a design science methodology: we discuss the architecture and development of the voice-based SA at length and present working results. Empirical testing was conducted as a user study on two groups, one trained with conventional methods and the other with the help of the SA, and the outcomes were compared using statistical analysis tools. The statistical results demonstrate improved training efficacy and performance among the SA-assisted medical responders. The paper also discusses accuracy and time of task execution (t), and concludes with guidelines for resolving the identified problems.
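The two-group comparison described above can be illustrated with a standard statistical test. The sketch below applies Welch's t-test to hypothetical task-execution times for a conventionally trained group and an SA-assisted group; the data values, group sizes, and the choice of Welch's test are illustrative assumptions, not the paper's actual measurements or analysis.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and degrees of freedom for two independent samples."""
    na, nb = len(sample_a), len(sample_b)
    se_a = variance(sample_a) / na  # squared standard error, group A
    se_b = variance(sample_b) / nb  # squared standard error, group B
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (na - 1) + se_b ** 2 / (nb - 1))
    return t, df

# Hypothetical task-execution times in seconds (illustrative values only).
conventional = [92.1, 88.4, 95.0, 90.2, 93.7, 89.9]
sa_assisted = [81.3, 79.8, 84.2, 80.5, 83.0, 78.9]

t_stat, dof = welch_t(conventional, sa_assisted)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A large positive t here would indicate that the conventionally trained group took meaningfully longer on the task than the SA-assisted group.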
In this paper, we present a novel pipelined near real-time speaker recognition architecture that improves speaker recognition performance by exploiting a hybrid feature extraction technique that combines Gabor filter (GF) responses, Convolutional Neural Network (CNN) features, and statistical parameters into a single feature matrix. The architecture was developed to enable secure access to a voice-based user interface (UI) through speaker-based authentication and integration with an existing Natural Language Processing (NLP) system; gaining secure access to that existing NLP system also served as motivation. We first identify the challenges of real-time speaker recognition and review recent research in the field. We then analyze the functional requirements of a speaker recognition system and introduce mechanisms that address these requirements through the proposed architecture. Subsequently, the paper discusses the contribution of each technique, CNN, GF, and statistical parameters, to feature extraction. For classification, standard classifiers such as Support Vector Machine (SVM), Random Forest (RF), and Deep Neural Network (DNN) are investigated. To verify the validity and effectiveness of the proposed architecture, we compared parameters including accuracy, sensitivity, and specificity against the standard AlexNet architecture.
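To make the hybrid feature idea concrete, the sketch below builds a single feature vector from per-band Gabor filter energies concatenated with frame-level statistical parameters. This is a minimal illustrative sketch, not the authors' implementation: the kernel sizes, band frequencies, and the choice of statistics are assumptions, and the CNN-derived features the paper also combines are omitted here (they would simply be appended to the same vector).

```python
import math

def gabor_kernel(size, freq, sigma):
    """Real part of a 1-D Gabor kernel: a Gaussian-windowed cosine."""
    half = size // 2
    return [math.exp(-(n ** 2) / (2 * sigma ** 2)) * math.cos(2 * math.pi * freq * n)
            for n in range(-half, half + 1)]

def convolve(signal, kernel):
    """'Same'-length 1-D convolution with zero padding at the edges."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out

def statistical_params(signal):
    """Frame-level statistics: mean, variance, zero-crossing rate."""
    n = len(signal)
    mu = sum(signal) / n
    var = sum((x - mu) ** 2 for x in signal) / n
    zcr = sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0) / (n - 1)
    return [mu, var, zcr]

def hybrid_features(signal, freqs=(0.05, 0.1, 0.2), sigma=4.0, size=15):
    """Concatenate per-band Gabor energies with frame-level statistics."""
    feats = []
    for f in freqs:
        band = convolve(signal, gabor_kernel(size, f, sigma))
        feats.append(sum(x ** 2 for x in band) / len(band))  # band energy
    feats.extend(statistical_params(signal))
    return feats

# Toy "speech frame": a pure sinusoid at normalized frequency 0.1.
frame = [math.sin(2 * math.pi * 0.1 * t) for t in range(200)]
vec = hybrid_features(frame)
print(len(vec), "features:", [round(v, 3) for v in vec])
```

In a full pipeline, vectors like this would be stacked per frame into the single matrix the paper describes and passed to an SVM, RF, or DNN classifier.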
At the time of writing this paper, the world has around eleven million cases of COVID-19, caused by the virus scientifically known as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). One critical step that various health organizations advocate to prevent the spread of this contagious disease is self-assessment of symptoms. Multiple organizations have already pioneered mobile and web-based applications for COVID-19 self-assessment to reduce the spread of this global pandemic. We propose an intelligent voice-based assistant for COVID-19 self-assessment (IVACS). This interactive assistant has been built to assess symptoms related to COVID-19 using the guidelines provided by the Centers for Disease Control and Prevention (CDC) and the World Health Organization (WHO). Empirical testing of the application was performed with 22 volunteer human subjects using the NASA Task Load Index (TLX), and the subjects' performance accuracy was measured. The results indicate that IVACS is beneficial to users, though it still needs additional research and development before widespread application.
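A guideline-driven self-assessment of the kind IVACS performs can be sketched as a simple rule-based mapping from reported symptoms to guidance. The symptom lists, thresholds, and guidance strings below are simplified placeholders for illustration only; they are not the actual CDC/WHO decision logic or the assistant's dialogue flow.

```python
# Illustrative emergency warning signs and common symptoms (placeholders).
EMERGENCY_SIGNS = {"trouble breathing", "persistent chest pain", "bluish lips"}
COMMON_SYMPTOMS = {"fever", "cough", "fatigue", "loss of taste or smell",
                   "sore throat", "headache"}

def assess(symptoms):
    """Map a list of reported symptoms to a coarse guidance level."""
    reported = {s.strip().lower() for s in symptoms}
    if reported & EMERGENCY_SIGNS:
        return "seek emergency care"
    matches = len(reported & COMMON_SYMPTOMS)
    if matches >= 2:
        return "self-isolate and contact a provider for testing"
    if matches == 1:
        return "monitor symptoms at home"
    return "no COVID-19 symptoms reported"

print(assess(["fever", "cough"]))
print(assess(["trouble breathing"]))
```

In a voice-based assistant, a speech front end would collect the symptom list turn by turn and a text-to-speech back end would read the resulting guidance to the user.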
Multimodal human–computer interaction (HCI) systems promise a more human-like interaction between machines and people. Their ability to support unambiguous information exchange between the two makes these systems more reliable, efficient, less error prone, and capable of solving complex tasks. Emotion recognition is a realm of HCI that employs multimodality to achieve accurate and natural results. The widespread use of affective identification in e-learning, marketing, security, health sciences, and other fields has increased demand for high-precision emotion recognition systems. Machine learning (ML) is increasingly applied to improve the process, whether by refining architectures or by leveraging high-quality databases (DBs). This paper presents a survey of the DBs used to develop multimodal emotion recognition (MER) systems. The survey covers DBs containing multi-channel data such as facial expressions, speech, physiological signals, body movements, gestures, and lexical features. A few unimodal DBs that work in conjunction with other DBs for affect recognition are also discussed. Further, VIRI, a new DB of visible and infrared (IR) images of subjects expressing five emotions in an uncontrolled, real-world environment, is presented, along with a rationale for the superiority of the presented corpus over existing ones.