A brain-computer interface (BCI) is a technology that uses neural features to restore or augment the capabilities of its user. A BCI for speech would enable communication in real time via neural correlates of attempted or imagined speech. Such a technology would potentially restore communication and improve quality of life for locked-in patients and other patients with severe communication disorders. There have been many recent developments in neural decoders, neural feature extraction, and brain recording modalities facilitating BCI for the control of prosthetics and in automatic speech recognition (ASR). Indeed, ASR and related fields have developed significantly over the past years, and may lend many insights into the requirements, goals, and strategies for speech BCI. Neural speech decoding is a comparatively new field but has shown much promise, with recent studies demonstrating semantic, auditory, and articulatory decoding using electrocorticography (ECoG) and other neural recording modalities. Because the neural representations for speech and language are widely distributed over cortical regions spanning the frontal, parietal, and temporal lobes, the mesoscopic scale of population activity captured by ECoG surface electrode arrays may have distinct advantages for speech BCI, in contrast to the advantages of microelectrode arrays for upper-limb BCI. Nevertheless, there remain many challenges for the translation of speech BCIs to clinical populations. This review discusses and outlines the current state of the art for speech BCI and explores what a speech BCI using chronic ECoG might entail.
Background: The selection of optimal deep brain stimulation (DBS) parameters is time‐consuming, experience‐dependent, and best suited when acute effects of stimulation can be observed (e.g., tremor reduction). Objectives: To test the hypothesis that optimal stimulation location can be estimated based on the cortical connections of DBS contacts. Methods: We analyzed a cohort of 38 patients with Parkinson's disease (24 training, and 14 test cohort). Using whole‐brain probabilistic tractography, we first mapped the cortical regions associated with stimulation‐induced efficacy (rigidity, bradykinesia, and tremor improvement) and side effects (paresthesia, motor contractions, and visual disturbances). We then trained a support vector machine classifier to categorize DBS contacts into efficacious, defined by a therapeutic window ≥2 V (threshold for side effect minus threshold for efficacy), based on their connections with cortical regions associated with efficacy versus side effects. The connectivity‐based classifications were then compared with actual stimulation contacts using receiver‐operating characteristics (ROC) curves. Results: Unique cortical clusters were associated with stimulation‐induced efficacy and side effects. In the training dataset, 42 of the 47 stimulation contacts were accurately classified as efficacious, with a therapeutic window of ≥3 V in 31 (66%) and between 2 and 2.9 V in 11 (24%) electrodes. This connectivity‐based estimation was successfully replicated in the test cohort with similar accuracy (area under ROC = 0.83). Conclusions: Cortical connections can predict the efficacy of DBS contacts and potentially facilitate DBS programming. The clinical utility of this paradigm in optimizing DBS outcomes should be prospectively tested, especially for directional electrodes.
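The labeling rule stated in the Methods (therapeutic window = threshold for side effect minus threshold for efficacy, with a contact called efficacious at ≥2 V) can be written out as a small sketch. The function and parameter names below are hypothetical, chosen only for illustration; this covers the label definition only, not the tractography or the support vector machine classifier.

```python
def therapeutic_window(side_effect_threshold_v: float,
                       efficacy_threshold_v: float) -> float:
    """Therapeutic window in volts: side-effect threshold minus efficacy threshold."""
    return side_effect_threshold_v - efficacy_threshold_v


def is_efficacious(side_effect_threshold_v: float,
                   efficacy_threshold_v: float,
                   min_window_v: float = 2.0) -> bool:
    """Label a contact efficacious when its window is at least min_window_v.

    The 2.0 V default follows the abstract; the names are illustrative.
    """
    window = therapeutic_window(side_effect_threshold_v, efficacy_threshold_v)
    return window >= min_window_v
```

For instance, a contact with a hypothetical side-effect threshold of 4.5 V and efficacy threshold of 1.5 V has a 3.0 V window and would be labeled efficacious, while one with thresholds of 3.0 V and 1.5 V would not.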
Neural keyword spotting could form the basis of a speech brain-computer-interface for menu-navigation if it can be done with low latency and high specificity comparable to the “wake-word” functionality of modern voice-activated AI assistant technologies. This study investigated neural keyword spotting using motor representations of speech via invasively-recorded electrocorticographic signals as a proof-of-concept. Neural matched filters were created from monosyllabic consonant-vowel utterances: one keyword utterance, and 11 similar non-keyword utterances. These filters were used in an analog to the acoustic keyword spotting problem, applied for the first time to neural data. The filter templates were cross-correlated with the neural signal, capturing temporal dynamics of neural activation across cortical sites. Neural vocal activity detection (VAD) was used to identify utterance times and a discriminative classifier was used to determine if these utterances were the keyword or non-keyword speech. Model performance appeared to be highly related to electrode placement and spatial density. Vowel height (/a/ vs /i/) was poorly discriminated in recordings from sensorimotor cortex, but was highly discriminable using neural features from superior temporal gyrus during self-monitoring. The best performing neural keyword detection (5 keyword detections with two false-positives across 60 utterances) and neural VAD (100% sensitivity, ~1 false detection per 10 utterances) came from high-density (2 mm electrode diameter and 5 mm pitch) recordings from ventral sensorimotor cortex, suggesting the spatial fidelity and extent of high-density ECoG arrays may be sufficient for the purpose of speech brain-computer-interfaces.
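The cross-correlation step described above can be illustrated with a minimal NumPy sketch. It assumes the neural features are arranged as a channels-by-time array and the template as a channels-by-width array; the function name, the array layout, and the mean-centering and normalization are assumptions made for this example, not details taken from the study. The vocal activity detection and discriminative classification stages are not shown.

```python
import numpy as np


def matched_filter_score(signal: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Cross-correlate a multichannel template with a multichannel signal.

    signal:   array of shape (channels, time)
    template: array of shape (channels, width), with width <= time
    Returns one score per valid lag (length time - width + 1), summed
    across channels, so the score peaks where the spatiotemporal
    pattern of the template best matches the signal.
    """
    n_channels, _ = signal.shape
    # Illustrative choice: center each channel of the template and scale
    # the whole template to unit norm so scores are comparable.
    centered = template - template.mean(axis=1, keepdims=True)
    centered = centered / (np.linalg.norm(centered) + 1e-12)

    width = template.shape[1]
    score = np.zeros(signal.shape[1] - width + 1)
    for ch in range(n_channels):
        # mode="valid" keeps only lags where the template fully overlaps.
        score += np.correlate(signal[ch], centered[ch], mode="valid")
    return score
```

In use, a candidate detection time would be taken where the score exceeds some threshold; how that threshold is set, and how detections are then classified as keyword or non-keyword, is outside this sketch.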
Recent studies have shown that speech can be reconstructed and synthesized using only brain activity recorded with intracranial electrodes, but until now this has only been done using retrospective analyses of recordings from able-bodied patients temporarily implanted with electrodes for epilepsy surgery. Here, we report online synthesis of intelligible words using a chronically implanted brain-computer interface (BCI) in a clinical trial participant (ClinicalTrials.gov, NCT03567213) with dysarthria due to amyotrophic lateral sclerosis (ALS). We demonstrate a reliable BCI that synthesizes commands freely chosen and spoken by the user from a vocabulary of 6 keywords originally designed to allow intuitive selection of items on a communication board. Our results show for the first time that a speech-impaired individual with ALS can use a chronically implanted BCI to reliably produce synthesized words that are intelligible to human listeners while preserving the participant's voice profile.