Speech production is a hierarchical mechanism involving the synchronization of the brain and the oral articulators, where the intention of linguistic concepts is transformed into meaningful sounds. Individuals with locked-in syndrome (fully paralyzed but aware) completely lose motor ability, including articulation and even eye movement. The neural pathway may be the only option for these patients to regain a level of communication. Current brain-computer interfaces (BCIs) build communication from patients' visual and attentional correlates, resulting in a slow communication rate (a few words per minute). Directly decoding imagined speech from neural signals (and then driving a speech synthesizer) has the potential for a higher communication rate. In this study, we investigated the decoding of five imagined and spoken phrases from single-trial, non-invasive magnetoencephalography (MEG) signals collected from eight adult subjects. Two machine learning approaches were used: an artificial neural network (ANN) trained on statistical features as the baseline, and convolutional neural networks (CNNs) applied to spatial, spectral, and temporal features extracted from the MEG signals. Experimental results indicated that imagined and spoken phrases can be decoded directly from neuromagnetic signals. CNNs were highly effective, with average decoding accuracies of up to 93% for imagined and 96% for spoken phrases.
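The abstract does not give the CNN architecture, so the following is a minimal sketch in PyTorch: a small 2-D CNN that treats each trial as a channels × time array and outputs scores for the five phrases. The sensor count, trial length, layer sizes, and kernel shapes are illustrative assumptions, not the authors' design.

```python
# Minimal sketch of a CNN phrase classifier on MEG trials.
# All shapes and layer choices are hypothetical placeholders.
import torch
import torch.nn as nn

N_CHANNELS = 204    # assumed number of MEG sensors
N_TIMEPOINTS = 250  # assumed samples per trial after downsampling
N_PHRASES = 5       # five imagined/spoken phrases

class PhraseCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            # treat the (channels x time) trial as a 1-channel image
            nn.Conv2d(1, 16, kernel_size=(5, 5), padding=2),
            nn.ReLU(),
            nn.MaxPool2d((2, 2)),
            nn.Conv2d(16, 32, kernel_size=(3, 3), padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, 8)),
        )
        self.classifier = nn.Linear(32 * 8 * 8, N_PHRASES)

    def forward(self, x):  # x: (batch, 1, channels, time)
        z = self.features(x)
        return self.classifier(z.flatten(1))

model = PhraseCNN()
trial = torch.randn(4, 1, N_CHANNELS, N_TIMEPOINTS)  # dummy batch
logits = model(trial)  # (4, 5) class scores, one per phrase
```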
Neurodegenerative diseases such as amyotrophic lateral sclerosis (ALS) can cause locked-in syndrome (fully paralyzed but aware). A brain-computer interface (BCI) may be the only option to restore communication for these patients. Current BCIs typically use visual or attentional correlates in neural activity to select letters randomly displayed on a screen, making them extremely slow (a few words per minute). Speech-BCIs, which aim to convert brain activity patterns to speech (neural speech decoding), hold the potential to enable faster communication. Although a few recent studies have shown the potential of neural speech decoding, they have focused on speaker-dependent models. In this study, we investigated speaker-independent neural speech decoding of five continuous phrases from magnetoencephalography (MEG) signals recorded while eight subjects produced speech covertly (imagination) or overtly (articulation). We used both supervised and unsupervised speaker adaptation strategies to implement a speaker-independent model. Experimental results demonstrated that the proposed adaptation-based speaker-independent model significantly improved decoding performance. To our knowledge, this is the first demonstration of speaker-independent neural speech decoding.
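The adaptation strategies are not detailed in the abstract; one plausible reading of supervised adaptation is fine-tuning a model trained on pooled data from the other subjects using a few labeled trials from the target subject. The sketch below illustrates that idea; the function name, optimizer, and hyperparameters are hypothetical.

```python
# Hedged sketch of supervised speaker adaptation: fine-tune a
# speaker-independent (pooled) model on a small labeled set from
# the target subject. Procedure and settings are assumptions.
import torch
import torch.nn as nn

def adapt_to_speaker(pooled_model: nn.Module,
                     target_x: torch.Tensor,  # few labeled target trials
                     target_y: torch.Tensor,  # phrase labels for those trials
                     epochs: int = 10,
                     lr: float = 1e-4) -> nn.Module:
    """Fine-tune a pooled model on trials from one new subject."""
    optimizer = torch.optim.Adam(pooled_model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    pooled_model.train()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(pooled_model(target_x), target_y)
        loss.backward()
        optimizer.step()
    return pooled_model
```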
Advancing knowledge about neural speech mechanisms is critical for developing next-generation, faster brain-computer interfaces to assist speech communication for patients with severe neurological conditions (e.g., locked-in syndrome). Among current neuroimaging techniques, magnetoencephalography (MEG) provides a direct representation of the large-scale neural dynamics underlying cognitive processes, owing to its excellent spatiotemporal resolution. However, the neural signals measured by MEG are small in magnitude compared to the background noise, so MEG typically suffers from a low signal-to-noise ratio (SNR) at the single-trial level. To overcome this limitation, it is common to record many trials of the same task and analyze the time-locked average signal, which can be very time consuming. In this study, we investigated how many MEG recording trials are required for speech decoding with a machine learning algorithm. We used a wavelet filter to generate denoised neural features for training an artificial neural network (ANN) for speech decoding. We found that wavelet-based denoising increased the SNR of the neural signal prior to analysis and enabled accurate speech decoding with as few as 40 single trials. This study may open up the possibility of limiting the number of trials in other task-evoked MEG studies as well.
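A common wavelet-denoising recipe, sketched below with PyWavelets, decomposes each sensor's time series, soft-thresholds the detail coefficients with a universal threshold, and reconstructs the signal. The wavelet family (db4), decomposition level, and thresholding rule are assumptions; the abstract does not specify them.

```python
# Minimal sketch of wavelet-based denoising of one MEG channel.
# Wavelet, level, and threshold rule are illustrative assumptions.
import numpy as np
import pywt

def wavelet_denoise(trial: np.ndarray, wavelet: str = "db4",
                    level: int = 5) -> np.ndarray:
    """Soft-threshold the detail coefficients of one sensor's time series."""
    coeffs = pywt.wavedec(trial, wavelet, level=level)
    # noise estimate from the finest-scale coefficients (MAD / 0.6745)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(trial.size))  # universal threshold
    denoised = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft")
                              for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)[: trial.size]

noisy = np.random.randn(1000)  # stand-in for one MEG channel's trial
clean = wavelet_denoise(noisy)
```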
Direct decoding of speech from the brain is a faster alternative to current electroencephalography (EEG) speller-based brain-computer interfaces (BCIs) for providing communication assistance to locked-in patients. Magnetoencephalography (MEG) has recently shown great potential as a non-invasive neuroimaging modality for neural speech decoding, owing in part to its spatial selectivity over other high-temporal-resolution devices. Standard MEG systems have a large number of cryogenically cooled channels/sensors (200-300) encapsulated within a fixed liquid-helium dewar, precluding their use as wearable BCI devices. Fortunately, recently developed optically pumped magnetometers (OPMs) do not require cryogens and have the potential to be wearable and movable, making them more suitable for BCI applications. This design is also modular, allowing customized montages that include only the sensors necessary for a particular task. Since the number of sensors heavily influences the cost, size, and weight of MEG systems, minimizing the number of sensors is critical for designing practical MEG-based BCIs. In this study, we sought to identify an optimal set of MEG channels for decoding imagined and spoken phrases from MEG signals. Using a forward selection algorithm with a support vector machine classifier, we found that nine optimally located MEG gradiometers provided higher decoding accuracy than using all channels. Additionally, the forward selection algorithm achieved performance similar to dimensionality reduction using a stacked sparse autoencoder. Analysis of the spatial dynamics of speech decoding suggested that sensors in both the left and right hemispheres contribute to speech decoding. Sensors located approximately over Broca's area were commonly among the higher-ranked sensors across all subjects.

Index terms: autoencoder, brain-computer interface, forward selection algorithm, magnetoencephalography, neural speech decoding, OPM, SVM
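Greedy forward channel selection scored by a cross-validated SVM could look like the sketch below. How each channel's signal is featurized, the SVM kernel, and the data shapes are illustrative assumptions rather than the paper's exact procedure.

```python
# Hedged sketch of greedy forward MEG channel selection with an SVM
# scorer, using scikit-learn. Feature representation and shapes are
# hypothetical; the paper's exact pipeline is not specified here.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def forward_select_channels(X, y, n_select=9):
    """X: (trials, channels, features_per_channel). Greedily add the
    channel that most improves cross-validated SVM accuracy."""
    n_channels = X.shape[1]
    selected, remaining = [], list(range(n_channels))
    for _ in range(n_select):
        best_ch, best_acc = None, -np.inf
        for ch in remaining:
            cols = selected + [ch]
            feats = X[:, cols, :].reshape(len(X), -1)
            acc = cross_val_score(SVC(kernel="rbf"), feats, y, cv=5).mean()
            if acc > best_acc:
                best_ch, best_acc = ch, acc
        selected.append(best_ch)
        remaining.remove(best_ch)
    return selected

# dummy data: 100 trials, 20 channels, 8 features per channel, 5 phrases
X = np.random.randn(100, 20, 8)
y = np.random.randint(0, 5, size=100)
print(forward_select_channels(X, y, n_select=3))
```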