For many years, the vocal tract shape has been approximated by one-dimensional (1D) area functions to study the production of voice. More recently, 3D approaches allow one to deal with the complex 3D vocal tract, although area-based 3D geometries of circular cross-section are still in use. However, little is known about the influence of performing such a simplification, and some alternatives may exist between these two extreme options. To this aim, several vocal tract geometry simplifications for vowels [A], [i], and [u] are investigated in this work. Six cases are considered, consisting of realistic, elliptical, and circular cross-sections interpolated through a bent or straight midline. For frequencies below 4-5 kHz, the influence of bending and cross-sectional shape has been found weak, while above these values simplified bent vocal tracts with realistic cross-sections are necessary to correctly emulate higher-order mode propagation. To perform this study, the finite element method (FEM) has been used. FEM results have also been compared to a 3D multimodal method and to a classical 1D frequency domain model.
Working with the wave equation in mixed rather than irreducible form allows one to directly account for both, the acoustic pressure field and the acoustic particle velocity field. Indeed, this becomes the natural option in many problems, such as those involving waves propagating in moving domains, because the equations can easily be set in an arbitrary Lagrangian-Eulerian (ALE) frame of reference. Yet, when attempting a standard Galerkin finite element solution (FEM) for them, it turns out that an inf-sup compatibility constraint has to be satisfied, which prevents from using equal interpolations for the approximated acoustic pressure and velocity fields. In this work it is proposed to resort to a subgrid scale stabilization strategy to circumvent this condition and thus facilitate code implementation. As a possible application, we address the generation of diphthongs in voice production.
In this paper, a multimodal theory accounting for higher order acoustical propagation modes is presented as an extension to the classical plane wave theory. This theoretical development is validated against experiments on vocal tract replicas, obtained using a 3D printer and finite element simulations. Simplified vocal tract geometries of increasing complexity are used to investigate the influence of some geometrical parameters on the acoustical properties of the vocal tract. It is shown that the higher order modes can produce additional resonances and anti-resonances and can also strongly affect the radiated sound. These effects appear to be dependent on the eccentricity and the cross-sectional shape of the geometries. Finally, the comparison between the simulations and the experiments points out the importance of taking visco-thermal losses into account to increase the accuracy of the resonance bandwidths prediction.
One of the key effects to model in voice production is that of acoustic radiation of sound waves emanating from the mouth. The use of three-dimensional numerical simulations allows to naturally account for it, as well as to consider all geometrical head details, by extending the computational domain out of the vocal tract. Despite this advantage, many approximations to the head geometry are often performed for simplicity and impedance load models are still used as well to reduce the computational cost. In this work, the impact of some of these simplifications on radiation effects is examined for vowel production in the frequency range 0-10 kHz, by means of comparison with radiation from a realistic head. As a result, recommendations are given on their validity depending on whether high frequency energy (above 5 kHz) should be taken into account or not.
Three-dimensional (3-D) numerical approaches for voice production are currently being investigated and developed. Radiation losses produced when sound waves emanate from the mouth aperture are one of the key aspects to be modeled. When doing so, the lips are usually removed from the vocal tract geometry in order to impose a radiation impedance on a closed cross-section, which speeds up the numerical simulations compared to free-field radiation solutions. However, lips may play a significant role. In this work, the lips' effects on vowel sounds are investigated by using 3-D vocal tract geometries generated from magnetic resonance imaging. To this aim, two configurations for the vocal tract exit are considered: with lips and without lips. The acoustic behavior of each is analyzed and compared by means of time-domain finite element simulations that allow free-field wave propagation and experiments performed using 3-D-printed mechanical replicas. The results show that the lips should be included in order to correctly model vocal tract acoustics not only at high frequencies, as commonly accepted, but also in the low frequency range below 4 kHz, where plane wave propagation occurs.
The experimental two-microphone transfer function method (TMTF) is adapted to the numerical framework to compute the radiation and input impedances of three-dimensional vocal tracts of elliptical cross section. In its simplest version, the TMTF method only requires measuring the acoustic pressure at two points in an impedance duct and the postprocessing of the corresponding transfer function. However, some considerations are to be taken into account when using the TMTF method in the numerical context, which constitute the main objective of this paper. In particular, the importance of including absorption at the impedance duct walls to avoid lengthy numerical simulations is discussed and analytical complex axial wave numbers for elliptical ducts are derived for this purpose. It is also shown how the plane wave restriction of the TMTF method can be circumvented to some extent by appropriate location of the virtual microphones, thus extending the method frequency range of validity. Virtual microphone spacing is also discussed on the basis of the so called singularity factor. Numerical examples include the computation of the radiation impedance of vowels /a/, /i/ and /u/ and the input impedance of vowel /a/, for simplified vocal tracts of circular and elliptical cross sections.
Two-dimensional (2D) numerical simulations of vocal tract acoustics may provide a good balance between the high quality of three-dimensional (3D) finite element approaches and the low computational cost of one-dimensional (1D) techniques. However, 2D models are usually generated by considering the 2D vocal tract as a midsagittal cut of a 3D version, i.e., using the same radius function, wall impedance, glottal flow, and radiation losses as in 3D, which leads to strong discrepancies in the resulting vocal tract transfer functions. In this work, a four step methodology is proposed to match the behavior of 2D simulations with that of 3D vocal tracts with circular cross-sections. First, the 2D vocal tract profile becomes modified to tune the formant locations. Second, the 2D wall impedance is adjusted to fit the formant bandwidths. Third, the 2D glottal flow gets scaled to recover 3D pressure levels. Fourth and last, the 2D radiation model is tuned to match the 3D model following an optimization process. The procedure is tested for vowels /a/, /i/, and /u/ and the obtained results are compared with those of a full 3D simulation, a conventional 2D approach, and a 1D chain matrix model.
Medical imaging techniques are usually utilized to acquire the vocal tract geometry in 3D, which may then be used, eg, for acoustic/fluid simulation. As an alternative, such a geometry may also be acquired from a biomechanical simulation, which allows to alter the anatomy and/or articulation to study a variety of configurations. In a biomechanical model, each physical structure is described by its geometry and its properties (such as mass, stiffness, and muscles). In such a model, the vocal tract itself does not have an explicit representation, since it is a cavity rather than a physical structure. Instead, its geometry is defined implicitly by all the structures surrounding the cavity, and such an implicit representation may not be suitable for visualization or for acoustic/fluid simulation. In this work, we propose a method to reconstruct the vocal tract geometry at each time step during the biomechanical simulation. Complexity of the problem, which arises from model alignment artifacts, is addressed by the proposed method. In addition to the main cavity, other small cavities, including the piriform fossa, the sublingual cavity, and the interdental space, can be reconstructed. These cavities may appear or disappear by the position of the larynx, the mandible, and the tongue. To illustrate our method, various static and temporal geometries of the vocal tract are reconstructed and visualized. As a proof of concept, the reconstructed geometries of three cardinal vowels are further used in an acoustic simulation, and the corresponding transfer functions are derived.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.