Interspeech 2022
DOI: 10.21437/interspeech.2022-10868

Data Augmentation for End-to-end Silent Speech Recognition for Laryngectomees

Cited by 4 publications (5 citation statements)
References: 0 publications
“…The amplitude scaling factor was set at 0.02, the number of oscillations per second (Hz) of the noise was 40, and the phase was set to zero. For example, if the mean amplitude of an articulatory dimension was A, the sinusoidal noise that was added would […] 4) Random scaling (RS): A recent study [21] has shown the implementation of random scaling over EMA signals. With this in mind, we decided to explore a data augmentation strategy that involved altering the duration of the samples by randomly stretching or shrinking them on our ultrasound tongue image dataset (Fig.…”
Section: Consecutive Time Masking (CTM) | mentioning
confidence: 99%
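
The two augmentations named in the statement above (sinusoidal noise and random scaling of duration) can be illustrated with a minimal sketch. This is not the cited authors' implementation. The excerpt is cut off before it gives the exact noise expression, so scaling the sinusoid by the mean absolute amplitude is an assumption, as are the function names, the 1-D list representation of a signal, the `sample_rate_hz` parameter, the 0.8 to 1.2 stretch range, and the use of linear interpolation. Only the values 0.02, 40 Hz, and zero phase come from the quoted text.

```python
import math
import random


def add_sinusoidal_noise(signal, sample_rate_hz, scale=0.02, freq_hz=40.0, phase=0.0):
    """Add a sinusoid to a non-empty 1-D signal.

    ASSUMPTION: the noise amplitude is `scale` times the mean absolute
    amplitude of the signal. The quoted excerpt is truncated before the
    exact expression, so this is a guess, not the authors' formula.
    """
    mean_amp = sum(abs(x) for x in signal) / len(signal)
    return [
        x + scale * mean_amp
        * math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz + phase)
        for n, x in enumerate(signal)
    ]


def random_scale_duration(signal, low=0.8, high=1.2, rng=random):
    """Stretch or shrink a non-empty 1-D signal in time by a random factor.

    ASSUMPTION: the [low, high] range and linear interpolation are
    hypothetical choices made for illustration only.
    """
    factor = rng.uniform(low, high)
    new_len = max(2, round(len(signal) * factor))
    out = []
    for i in range(new_len):
        # Map the output index back to a fractional position in the input.
        pos = i * (len(signal) - 1) / (new_len - 1)
        left = int(pos)
        right = min(left + 1, len(signal) - 1)
        frac = pos - left
        out.append((1.0 - frac) * signal[left] + frac * signal[right])
    return out
```

Applied to image sequences or multi-channel kinematic data, the same idea would run per channel or per frame index; the sketch keeps to a single channel for brevity.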
“…Therefore, data augmentation plays a crucial role in SSI with UTI by providing additional training examples to prevent overfitting and enhance the performance of deep learning models, given the limited amount of available data. Data augmentation has been proposed as a method to generate additional training data for end-to-end SSR on EMA datasets: Cao and his colleagues applied data augmentation strategies to raw kinematic signals [21].…”
Section: Introduction | mentioning
confidence: 99%
“…Evaluators may need to consider multiple aspects of speech, such as pitch, tone, articulation, and prosody, which can make the evaluation more challenging [29]. Third, there is a limited amount of training data available for evaluating alaryngeal speech, as it is a relatively rare condition [30]. This can make it difficult to develop standardized evaluation methods and norms for different types of alaryngeal speech [31].…”
Section: Assessing Speech-signal Impairments | mentioning
confidence: 99%
“…In dealing with these urgent challenges in sustainable urban living, artificial intelligence (AI)-based applications play an important role. State-of-the-art AI-based technologies in image processing [1][2][3], video processing [4,5], speech and audio processing [6][7][8][9], music processing [10], natural language processing [11], multimodality processing [12][13][14], Internet of Things [15], edge computing [16], autonomous driving [17], heterogeneous computing [18][19][20], wireless networks [21][22][23], social science [24] and smart healthcare [25][26][27][28] could be helpful in adding intelligence to urban living and will provide better solutions to address challenges in sustainable urban living.…”
Section: Introduction | mentioning
confidence: 99%