Updating the Silent Speech Challenge benchmark with deep learning

Ji, Yan; Liu, Licheng; Wang, Hongcui; Liu, Zhilei; Niu, Zhibin; Denby, B.

doi:10.1016/j.specom.2018.02.002

Cited by 53 publications

(45 citation statements)

References 30 publications

(32 reference statements)


“…As the area of speech technology such as speech recognition and speech synthesis using deep learning has become wider, recent studies are attempting to solve the issue of articulatory-to-acoustic conversion [76]. In implementing SSI or silent speech recognition (SSR) technologies, such as sensor handling, interference, and feature extraction, using deep learning are also increasing to improve recognition performance [7]. Recently, DNN has been conducted more frequently than traditional systems, such as Gaussian mixture model (GMM) in speech recognition research, and CNN is also widely used because it proved to be effective in recognizing patterns in the speech signal and image processing [7].…”

Section: Deep Learning Based Voice Recognition (mentioning)

confidence: 99%

“…Thus, a novel concept must be developed for voice recognition and production technologies which also can include brain-computer interfaces (BCIs) and silent-speech interfaces (SSIs). SSI is considered as a plausible approach to producing natural-sounding speech by capturing biosignals from the articulators, neural pathways, or the brain itself in brain-computer interfaces (BCIs) [5, 6, 7, 8]. Recently, various biosignals captured by techniques such as ultrasound, optical imagery, EPG, EEG, and surface electromyography (sEMG) have been investigated in terms of developing silent speech communication systems [8, 9, 10].…”

Section: Introduction (mentioning)

confidence: 99%


Biosignal Sensors and Deep Learning-Based Speech Recognition: A Review

Lee, Seong, Ozlu, et al. 2021

Sensors


Voice is one of the essential mechanisms for communicating and expressing one’s intentions as a human being. There are several causes of voice inability, including disease, accident, vocal abuse, medical surgery, ageing, and environmental pollution, and the risk of voice loss continues to increase. Novel approaches should have been developed for speech recognition and production because that would seriously undermine the quality of life and sometimes leads to isolation from society. In this review, we survey mouth interface technologies which are mouth-mounted devices for speech recognition, production, and volitional control, and the corresponding research to develop artificial mouth technologies based on various sensors, including electromyography (EMG), electroencephalography (EEG), electropalatography (EPG), electromagnetic articulography (EMA), permanent magnet articulography (PMA), gyros, images and 3-axial magnetic sensors, especially with deep learning techniques. We especially research various deep learning technologies related to voice recognition, including visual speech recognition, silent speech interface, and analyze its flow, and systematize them into a taxonomy. Finally, we discuss methods to solve the communication problems of people with disabilities in speaking and future research with respect to deep learning components.


“…This model achieved a recognition accuracy of 80.4% when tested over the database developed in [106], which validated it for visual speech recognition. Deep autoencoders were used in [273], [274] to extract features from ultrasound images, achieving significant gains in both silent ASR and direct synthesis. In [275], multitask learning of speech recognition and synthesis parameters was evaluated in the context of an ultrasound-based SSI system designed to enhance the performance of individual tasks.…”

Section: Imaging Techniques (mentioning)

confidence: 99%

Silent Speech Interfaces for Speech Restoration: A Review

et al. 2020


This review summarises the status of silent speech interface (SSI) research. SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication whenever normal verbal communication is not possible or not desirable. In this review, we focus on the first case and present latest SSI research aimed at providing new alternative and augmentative communication methods for persons with severe speech disorders. SSIs can employ a variety of biosignals to enable silent communication, such as electrophysiological recordings of neural activity, electromyographic (EMG) recordings of vocal tract movements or the direct tracking of articulator movements using imaging techniques. Depending on the disorder, some sensing techniques may be better suited than others to capture speech-related information. For instance, EMG and imaging techniques are well suited for laryngectomised patients, whose vocal tract remains almost intact but are unable to speak after the removal of the vocal folds, but fail for severely paralysed individuals. From the biosignals, SSIs decode the intended message, using automatic speech recognition or speech synthesis algorithms. Despite considerable advances in recent years, most present-day SSIs have only been validated in laboratory settings for healthy users. Thus, as discussed in this paper, a number of challenges remain to be addressed in future research before SSIs can be promoted to real-world applications. If these issues can be addressed successfully, future SSIs will improve the lives of persons with severe speech impairments by restoring their communication capabilities.


“…The spider monkey optimization algorithm (SMO) is one of the metaheuristic methods [41, 44–46] based on the spider monkey's social behavior, adopting the fission and fusion swarm intelligence tactic for foraging [47]. Spider monkeys usually live in a swarm of 40 to 50 members.…”

Section: Spider Monkey Optimization Algorithm (mentioning)

confidence: 99%

SMO-DNN: Spider Monkey Optimization and Deep Neural Network Hybrid Classifier Model for Intrusion Detection

et al. 2020


The enormous growth in internet usage has led to the development of different malicious software posing serious threats to computer security. The various computational activities carried out over the network have huge chances to be tampered and manipulated and this necessitates the emergence of efficient intrusion detection systems. The network attacks are also dynamic in nature, something which increases the importance of developing appropriate models for classification and predictions. Machine learning (ML) and deep learning algorithms have been prevalent choices in the analysis of intrusion detection systems (IDS) datasets. The issues pertaining to quality and quality of data and the handling of high dimensional data is managed by the use of nature inspired algorithms. The present study uses a NSL-KDD and KDD Cup 99 dataset collected from the Kaggle repository. The dataset was cleansed using the min-max normalization technique and passed through the 1-N encoding method for achieving homogeneity. A spider monkey optimization (SMO) algorithm was used for dimensionality reduction and the reduced dataset was fed into a deep neural network (DNN). The SMO based DNN model generated classification results with 99.4% and 92% accuracy, 99.5% and 92.7% of precision, 99.5% and 92.8% of recall and 99.6% and 92.7% of F1-score, utilizing minimal training time. The model was further compared with principal component analysis (PCA)-based DNN and the classical DNN models, wherein the results justified the advantage of implementing the proposed model over other approaches.

