2016
DOI: 10.1088/1741-2560/13/5/056004

Neural speech recognition: continuous phoneme decoding using spatiotemporal representations of human cortical activity

Abstract: The superior temporal gyrus (STG) and neighboring brain regions play a key role in human language processing. Previous studies have attempted to reconstruct speech information from brain activity in the STG, but few of them incorporate the probabilistic framework and engineering methodology used in modern speech recognition systems. In this work, we describe the initial efforts toward the design of a neural speech recognition (NSR) system that performs continuous phoneme recognition on English stimuli with arb…

Cited by 80 publications (82 citation statements)
References: 59 publications
“…Our results demonstrate intelligible speech synthesis from ECoG during both audible and silently mimed speech production. Previous strategies for neural decoding of speech have primarily focused on direct classification of speech segments like phonemes or words [30,31,32,33]. However, these demonstrations have been limited in their ability to scale to larger vocabulary sizes and communication rates.…”
Section: Discussion (mentioning)
confidence: 99%
“…Multiple studies have demonstrated the relevance of this frequency band for examining neural mechanisms of auditory cortical processing (e.g., Crone et al, 2001, 2006; Brugge et al, 2009; Edwards et al, 2009; Mesgarani and Chang, 2012; Steinschneider et al, 2014; Nourski and Howard, 2015). High gamma activity has been directly related to acoustic-phonemic transformations at the level of the STG, which would be a key process required for tasks used in the present study (Mesgarani et al, 2014; Moses et al, 2016). Further, functional neuroimaging studies have demonstrated a positive correlation between high gamma activity and hemodynamic responses (Nir et al, 2007; Whittingstall and Logothetis, 2009).…”
Section: Introduction (mentioning)
confidence: 85%
“…We recorded the local field potential from each electrode, notch-filtered the signal at 60 Hz and harmonics (120 Hz and 180 Hz) to reduce line-noise related artifacts, and re-referenced to the common average across channels sharing the same connector to the preamplifier (Cheung et al, 2016). We then used the log-analytic amplitude of the Hilbert transform to bandpass signals in the high gamma range (70-150 Hz), using 8 logarithmically-spaced center frequency bands and taking the first principal component across these bands to extract stimulus-related neural activity (Edwards et al, 2009; Moses et al, 2016; Ray and Maunsell, 2011). High gamma signals were then downsampled to 100 Hz for further analysis.…”
Section: Neural Recordings (mentioning)
confidence: 99%
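
The preprocessing steps in the Neural Recordings statement above (notch filtering, common average referencing, high gamma extraction, downsampling) can be sketched roughly as follows. This is an illustrative sketch using NumPy and SciPy, not the cited authors' code. The function name `high_gamma`, the notch Q factor, the fourth-order Butterworth band-pass filters, and the use of log-spaced band edges (rather than the center frequencies mentioned in the statement) are all assumptions. The common average is also taken across all channels here, whereas the statement groups channels by preamplifier connector.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert, iirnotch, resample_poly, sosfiltfilt


def high_gamma(raw, fs, n_bands=8, lo=70.0, hi=150.0, out_fs=100):
    """Hypothetical helper: raw is (n_channels, n_samples), fs an integer rate in Hz."""
    x = np.asarray(raw, dtype=float)

    # Notch out line noise at 60 Hz and its harmonics (Q = 30 is an assumed value).
    for f0 in (60.0, 120.0, 180.0):
        b, a = iirnotch(f0, Q=30.0, fs=fs)
        x = filtfilt(b, a, x, axis=-1)

    # Common average reference (assumption: one group containing every channel).
    x = x - x.mean(axis=0, keepdims=True)

    # Split 70-150 Hz into n_bands logarithmically spaced sub-bands (assumed edges).
    edges = np.geomspace(lo, hi, n_bands + 1)
    bands = []
    for f_lo, f_hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [f_lo, f_hi], btype="bandpass", fs=fs, output="sos")
        narrow = sosfiltfilt(sos, x, axis=-1)
        envelope = np.abs(hilbert(narrow, axis=-1))  # analytic amplitude
        bands.append(np.log(envelope + 1e-12))       # log-analytic amplitude
    bands = np.stack(bands, axis=-1)                 # (channels, samples, bands)

    # Project each channel onto the first principal component across bands.
    # The sign of a principal component is arbitrary.
    out = np.empty(x.shape)
    for ch in range(x.shape[0]):
        centered = bands[ch] - bands[ch].mean(axis=0)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        out[ch] = centered @ vt[0]

    # Downsample to out_fs (100 Hz in the statement) with a polyphase filter.
    return resample_poly(out, out_fs, fs, axis=-1)
```

With two seconds of four-channel data sampled at 1000 Hz, `high_gamma(data, fs=1000)` returns an array of shape `(4, 200)`. The sampling rate must exceed 360 Hz so that the 180 Hz notch lies below the Nyquist frequency.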