“…Similar to VUIs, SSIs allow users to converse with computers in natural language, which supports expressive commands without requiring users to remember complicated actions or gestures. Existing SSIs are characterized by the sensing methods and biosignals they use, such as tracking the movement of speech articulators using electromagnetic articulography (EMA) [13,17,53], imaging the vocal tract using ultrasound [22,35], capturing the subtle sounds of non-audible murmur (NAM) [59,60,61] and ingressive speech [15], placing capacitive sensors inside the mouth [33,40], and capturing facial electrical activity using surface electromyography (sEMG) [31,65]. In the field of Brain-Computer Interfaces (BCI), researchers seek to decode human speech directly from the electrical activity of the brain. These approaches can be categorized into invasive systems implanted in the cerebral cortex using electrocorticography (ECoG) [1,49] and non-invasive systems attached to the scalp using electroencephalography (EEG) [18,20,47].…”