Purpose
To enable dynamic speech imaging with high spatiotemporal resolution and full-vocal-tract spatial coverage, leveraging recent advances in sparse sampling.
Methods
An imaging method is developed to enable high-speed dynamic speech imaging by exploiting the low-rank structure and sparsity of dynamic images of articulatory motion during speech. The proposed method includes: a) a novel data acquisition strategy that collects navigators at a high temporal frame rate, and b) an image reconstruction method that derives temporal subspaces from the navigators and reconstructs high-resolution images from sparsely sampled data with joint low-rank and sparsity constraints.
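The subspace idea in b) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single coil, synthetic data that is exactly low-rank, and a plain least-squares fit, and it omits the sparsity constraint entirely. The function names, array shapes, and the 20% sampling density are hypothetical choices made for illustration.

```python
import numpy as np

def temporal_basis(navigators, rank):
    """Estimate a temporal subspace from navigator data.

    navigators: complex array of shape (n_nav, n_frames), sampled at every frame.
    Returns the leading `rank` right singular vectors, shape (rank, n_frames).
    """
    _, _, vh = np.linalg.svd(navigators, full_matrices=False)
    return vh[:rank]

def fit_spatial_coefficients(samples, mask, basis):
    """Fit per-location coefficients from sparsely sampled (k, t)-space data.

    samples: (n_k, n_frames) array, only entries where mask is True are used.
    mask:    boolean (n_k, n_frames) array marking acquired samples.
    basis:   (rank, n_frames) temporal basis.
    """
    n_k, rank = samples.shape[0], basis.shape[0]
    coeffs = np.zeros((n_k, rank), dtype=complex)
    for k in range(n_k):
        t = mask[k]
        # Least-squares solve of basis[:, t].T @ u = samples[k, t]
        coeffs[k], *_ = np.linalg.lstsq(basis[:, t].T, samples[k, t], rcond=None)
    return coeffs

def demo(seed=0):
    """Recover synthetic rank-3 data from about 20% of its samples."""
    rng = np.random.default_rng(seed)
    n_k, n_nav, n_frames, rank = 32, 4, 200, 3  # hypothetical sizes
    u = rng.standard_normal((n_k, rank)) + 1j * rng.standard_normal((n_k, rank))
    v = rng.standard_normal((rank, n_frames)) + 1j * rng.standard_normal((rank, n_frames))
    full = u @ v                              # exactly low-rank ground truth
    navigators = full[:n_nav]                 # a few locations sampled at every frame
    mask = rng.random((n_k, n_frames)) < 0.2  # sparse sampling elsewhere
    basis = temporal_basis(navigators, rank)
    coeffs = fit_spatial_coefficients(np.where(mask, full, 0), mask, basis)
    recon = coeffs @ basis
    return np.linalg.norm(recon - full) / np.linalg.norm(full)
```

Calling `demo()` returns the relative reconstruction error, which is near machine precision here only because the synthetic data is exactly rank 3 and noise-free. Measured data is only approximately low-rank, which is where additional constraints such as sparsity become useful.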
Results
The proposed method has been systematically evaluated and validated through several dynamic speech experiments. A nominal imaging speed of 102 frames per second (fps) was achieved for a single-slice imaging protocol with a spatial resolution of 2.2 × 2.2 × 6.5 mm³. An eight-slice imaging protocol covering the entire vocal tract achieved a nominal imaging speed of 12.8 fps at the same spatial resolution. The effectiveness and practical utility of the proposed method were also demonstrated in a phonetic investigation.
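One way to read the relation between the two frame rates, assuming the eight slices share the acquisition time equally so that the per-slice rate scales inversely with the slice count (an assumption, not stated in the abstract):

```latex
\frac{102\ \text{fps}}{8\ \text{slices}} = 12.75\ \text{fps per slice} \approx 12.8\ \text{fps}
```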
Conclusion
High spatiotemporal resolution with full-vocal-tract spatial coverage can be achieved for dynamic speech imaging experiments with low-rank and sparsity constraints.
Purpose
To enable a more comprehensive view of articulations during speech through near‐isotropic 3D dynamic MRI with high spatiotemporal resolution and large vocal‐tract coverage.
Methods
Using partial separability model‐based low‐rank reconstruction coupled with a sparse acquisition of both spatial and temporal models, we achieve near‐isotropic resolution 3D imaging with a high frame rate. The total acquisition time is shortened by introducing a sparse temporal sampling scheme that interleaves one temporal navigator with four randomized phase‐ and slice‐encoded imaging samples. Memory use and computation time are reduced by compressing coils based on the region of interest for low‐rank constrained reconstruction with an edge‐preserving spatial penalty.
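The interleaving described above can be sketched as a sampling schedule. This is an illustrative sketch only: the fixed navigator position, the uniform random distribution of encodes, the matrix sizes, and the function name `interleaved_schedule` are all assumptions, not details from the paper.

```python
import numpy as np

def interleaved_schedule(n_phase, n_slice, n_blocks, nav_line=(0, 0), seed=0):
    """Build an acquisition order of one navigator followed by four imaging samples.

    Each entry is (kind, ky, kz), where kind is "nav" or "img".
    The navigator location `nav_line` is a hypothetical fixed (ky, kz) position.
    """
    rng = np.random.default_rng(seed)
    schedule = []
    for _ in range(n_blocks):
        # One temporal navigator, always at the same k-space position
        schedule.append(("nav", *nav_line))
        # Four imaging samples with randomized phase and slice encodes
        ky = rng.integers(0, n_phase, size=4)
        kz = rng.integers(0, n_slice, size=4)
        schedule.extend(("img", int(y), int(z)) for y, z in zip(ky, kz))
    return schedule

# Hypothetical matrix size: 64 phase encodes by 32 slice encodes
schedule = interleaved_schedule(n_phase=64, n_slice=32, n_blocks=10)
```

With this pattern, one readout in every five is a navigator, so the navigator stream is sampled at one fifth of the readout rate while the remaining four fifths of the readouts accumulate spatial coverage.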
Results
The proposed method has been evaluated through experiments on several speech samples, including a standard reading passage. A near‐isotropic spatial resolution of 1.875 × 1.875 × 2 mm³, 64‐mm through‐plane coverage, and a temporal resolution of 35.6 fps are achieved. Investigation and analysis of specific speech samples support novel insights into nonsymmetric tongue movement, velum raising, and coarticulation events, with adequate visualization of rapid articulatory movements.
Conclusion
Three‐dimensional dynamic images of the vocal tract structures during speech with high spatiotemporal resolution and axial coverage are capable of enhancing linguistic research, enabling visualization of soft tissue motions that cannot be visualized with other modalities.