2022
DOI: 10.3390/brainsci12070818

Sequence-to-Sequence Voice Reconstruction for Silent Speech in a Tonal Language

Abstract: Silent speech decoding (SSD), based on articulatory neuromuscular activities, has become a prevalent task of brain–computer interfaces (BCIs) in recent years. Many works have been devoted to decoding surface electromyography (sEMG) from articulatory neuromuscular activities. However, restoring silent speech in tonal languages such as Mandarin Chinese is still difficult. This paper proposes an optimized sequence-to-sequence (Seq2Seq) approach to synthesize voice from the sEMG-based silent speech. We extract dur…
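The abstract describes mapping sEMG feature sequences to audible speech with a Seq2Seq model. The sketch below is a minimal, hypothetical illustration of that idea in PyTorch: a recurrent encoder over sEMG feature frames and a frame-level decoder that regresses mel-spectrogram frames. The layer types, dimensions (emg_dim, mel_dim, hidden), the mean-pooled context, and the teacher-forced decoding are all assumptions made for illustration, not the architecture or training setup reported in the paper.

```python
# Minimal sketch of a sequence-to-sequence mapping from sEMG feature frames to
# acoustic (e.g. mel-spectrogram) frames, assuming PyTorch. Dimensions and the
# GRU encoder/decoder choice are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn


class EMG2MelSeq2Seq(nn.Module):
    def __init__(self, emg_dim=64, mel_dim=80, hidden=256):
        super().__init__()
        # Encoder reads the sEMG feature sequence.
        self.encoder = nn.GRU(emg_dim, hidden, batch_first=True, bidirectional=True)
        # Decoder predicts one acoustic frame per step, conditioned on the
        # previous ground-truth frame (teacher forcing, for simplicity).
        self.decoder = nn.GRU(mel_dim + 2 * hidden, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, mel_dim)

    def forward(self, emg, mel_prev):
        # emg:      (batch, T_emg, emg_dim)
        # mel_prev: (batch, T_mel, mel_dim)
        enc_out, _ = self.encoder(emg)                  # (batch, T_emg, 2*hidden)
        # Crude alignment: summarize the encoder output and tile it over decoder steps.
        ctx = enc_out.mean(dim=1, keepdim=True)         # (batch, 1, 2*hidden)
        ctx = ctx.expand(-1, mel_prev.size(1), -1)
        dec_in = torch.cat([mel_prev, ctx], dim=-1)
        dec_out, _ = self.decoder(dec_in)
        return self.proj(dec_out)                       # (batch, T_mel, mel_dim)


if __name__ == "__main__":
    model = EMG2MelSeq2Seq()
    emg = torch.randn(2, 120, 64)   # toy sEMG feature sequences
    mel = torch.randn(2, 100, 80)   # toy target mel frames
    pred = model(emg, mel)
    loss = nn.functional.l1_loss(pred, mel)
    print(pred.shape, loss.item())
```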

Cited by 4 publications (1 citation statement)
References 46 publications (76 reference statements)
“…Technologies to capture these biosignals include vocal tract imaging [14], magnetic tracing [15], electroencephalogram [16], and EMG [17,18]. The conversion from these silent biosignals to audible speech can be done directly, using machine-learning algorithms that model the relationship between the feature vectors extracted from the biosignals and the acoustic signals [5,19], or indirectly, by first producing the related text [20,21,22] and then using a text-to-speech (TTS) model to generate synthetic speech.…”
Section: Introduction
confidence: 99%
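The citation statement above contrasts two conversion routes: a direct mapping from biosignal features to acoustic features, and an indirect route that first decodes text and then runs a text-to-speech model. The following sketch only contrasts those two routes with hypothetical helper objects (direct_model, recognizer, tts); their names and method signatures are placeholders for trained components, not APIs from the cited works.

```python
# Hypothetical sketch of the two biosignal-to-speech routes described above.
# "direct_model", "recognizer", and "tts" are placeholders for trained
# components; their method names are assumptions, not real library APIs.
import numpy as np


def convert_direct(biosignal_features: np.ndarray, direct_model) -> np.ndarray:
    """Direct route: regress acoustic features straight from biosignal features."""
    return direct_model.predict(biosignal_features)


def convert_indirect(biosignal_features: np.ndarray, recognizer, tts) -> np.ndarray:
    """Indirect route: decode the spoken text first, then synthesize it with TTS."""
    text = recognizer.transcribe(biosignal_features)
    return tts.synthesize(text)
```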