Study of Human Phonation with Attention on Sub-Grid Scale Turbulence Models and Initialization Effects

The first aim of this paper is to compare three cases, where two cases contain turbulence sub-grid scale (SGS) models, which are commonly applied in wall-bounded flows. They use a bit different formulation of how to estimate the eddy-viscosity fields in a vicinity of walls. The SGS effect is obvious on flow rates through an intra-glottal gap. The second aim is to attend to a direct impact of the specific SGS model onto the sound pressure levels of frequency components (human formants). The third aim is focused on the effect of an initial time, when the vocal folds are in a start convergent phase itself and when the flow is suddenly accelerated due to boundary conditions. The effect is shown at aeroacoustic spectra.


Introduction
The generation of the human voice itself is quite a complicated biophysical process concerning flow-structure-acoustic interactions. Viscoelastic multi-layer tissues covering vocal folds interact with air by oscillation (monopole acoustic source). This kind of oscillation and the expelled air ensure conditions for a creation of turbulent structures (dipole acoustic source). Finally, the frequency components are modulated by a specific vocal tract (quadrupole sources) and the acoustic energy is radiated outside to a far field.
To deal with this task is performed the hybrid method connecting these three approaches: 1. Computational Fluid Dynamics (CFD), 2. Conservative interpolation from a flow grid to an acoustic grid and 3. Computational AeroAcoustic (CAA) method. The enormous range scale between acoustic and flow variables was reduced by the Perturbed Convective Wave Equation (PCWE) approach [1], [2].

CFD modelling
The mathematical model has kept the following assumptions: 1. Divergence of the velocity vector is zero (i.e. the incompressible flow with a constant density, the Mach number <0.1), 2. Source term is kept to zero, 3. The diffusive term is non-zero (i.e. the viscous flow regime); The momentum transfer is mediated by macroscopic vortices, therefore the Large-eddy Simulation (LES) method performed the large-scale vortices by unsteady Navier-Stokes Equations and the small-scale vortices are modelled by the SGS turbulence model. The filtered momentum (vector) equation is presented in (1), containing from left side the time derivative of a particle motion, the convective term, the gradient of the static pressure, the laplacian of the velocity vector with a constant molecular kinematic viscosity (i.e. the diffusive term) and the divergence of the stress tensor.
These SGS turbulent models were used in this paper: 1.) One-equation model [3]. The transport equation for the turbulent energy is defined by (2), where the first term at the RHS is a dot product of the stress tensor and the symmetric part of the velocity gradient, i.e. the product term, leading by the formula in (3) and the turbulent viscosity is modelled, as it is seen in (4). The rest unknowns: Δ is a cell-of-volume length and the constants Cx concerning the model.
2.) WALE model [4], where is added the deviatoric part of the velocity gradient and this contribution is a key part, which prevents numerical instabilities, guaranteeing that the denominator does not converge to zero in cases with pure shear. The equation for the turbulent viscosity is in (5).

CAA modelling
This model respects following conditions: 1. The rotation of the vector of the acoustic velocity is kept to zero, 2. The pressure is dependent only on density, 3. Condition of incompressibility is allowed. To interpolate the source term from the flow grid is used the PCWE, which divides the pressure and velocity to a solenoidal (incompressible) and an acoustic contribution. The PCWE equation (6) contains an acoustic potential (scalar), which saves computational costs against the acoustic particle velocity (vector) and the relation is ̅ = −∇ ̅ . The last step is to compute the wave equation (WE), which is in the well-known form in (7), where the sound source term is hidden in the ̅ , i.e. the second time derivative of ̅ .

CFD-CAA mesh, numerical schemes, etc
The view of the computational CFD domain of the vocal folds in a coronal section is captured in Figure 1 and the 3D CFD domain consists of 2.2M block-structured hexahedral elements. The positions of vocal folds itself are in convergent shape with the gap g, which is not fully "in touch", as it should be in physiologically healthy phonation. The oscillation of vocal folds is prescribed by a function, in Table 1 the form is shown, and it allows two degrees of freedom, where the amplitudes 1,2 are both 0.3 mm, the fundamental frequency fo corresponding to 100 Hz and the phases are 1 = /2 and 2 = 0. The flow is driven by the constant pressure difference 307.5 Pa (i.e. lung pressure). The full-domain used for CAA simulations is in Figure 2, where the vocal tract is made from multiple frustums concatenated one after another (19k and 23k hexahedral elements for [u:] and [i:], respectively). The radius of the frustums presented on the Figure 2 determines the artificial mouth for the sound "who" (vowel u:). The 3D vocal tract shapes were built in accordance to the study [5] using magnetic resonance imaging (MRI)   For the spatial approximation was used the central difference scheme. To avoid the numerical dissipation is suitable to use non-dissipative CDS, which unfortunately brings problems with stability too. For the time discretization was used the implicit scheme with the second order of accuracy; The hardware computational costs are connected with the parameters of our university cluster Charon: The CFD simulations were run in parallel on 20 cores, composed on nodes with two 10-core Intel Xeon Silver 4114 CPUs with 96GB RAM. In Table 2 is presented the computational time requirements for each sub-case. The used software for finite volume CFD computations was used OpenFOAM [6] and afterwards the interpolation process, i.e. get the data from the flow grid to the acoustic grid and estimate the PCWE computation, is used a CFSDat, which is a data transformation tool, distributed as a part of a CFS++ [7]. The WE is computed also by the finite element software the CFS++.

Results and discussion
In Table 2 are shown the study cases. The cases marked as "1" are computed with the effect of the initialization, which included the flow data at t in <0.0;0.2> s, with a time step Δt = 0.0001 s and the subscript "2" means, that the computation is without the initialization effect, using the flow data started from the 1/100 of the first period of the vocal folds oscillation to the end of the 19th oscillation cycle, i.e. t in <0.0001;0.2> s. The maximum flow rate with the One-Equation and the WALE model were damped around 16.76 % and 5.23 % in contrast to the laminar A11. The increase of the flow rate, catched by the WALE SGS model, is caused by the formula for the turbulence viscosity, where the denominator is able to avoid zero for a pure shear strain or a rotational strain. The WALE SGS model vanished the high turbulent viscosity spots within the glottis and powered the amount of kinetic energy transferred from lungs to the supraglottal vestibule, in detail it is +50 ml/s (12 %) of airflow in comparison to the One-Equation model.
The monitoring point "mic1" (it was seen in Figure 2) was used at a far free-field and the signals were analyzed and compared in frequency domains (Figures 3a-7b). The acoustic spectrum in Figure 3a is considerably affected by omitting the initial step and the sound pressure level of the fundamental frequency : at 100 Hz is strengthened by 6 dB. In Figure   3b is the situation very similar, even the : and two higher harmonics in C11-2 are almost at the same level around 50 dB. In Figures 4a-5b is obvious, that the initial effects on the :

Conclusion
Numerical solutions of a subsonic incompressible on the computational grid of larynx were computed by Large-Eddy Simulation method and different sub-grid scale models were applied and compared from the aeroacoustic point of view. To summarize the WALE model estimated a more realistic turbulent eddy-viscosity field near the constricted vocal fold margins than the One-Equation SGS model and therefore it empowered the sound source terms. It is worth emphasising that the WALE SGS model positively modified the sound pressure levels of the: fundamental frequency, higher harmonics and frequencies defining human phonation process for vowel [u:] and [i:] at all. The idea of not including the initial flow field in aeroacoustic computations turned out to have a sense in the laminar case in a low-frequency band, where the sound pressure levels were strengthened, whereas in the cases with SGS models the initialization effect helped to improve the sound pressure level of fundamental frequency and showed less or negligible effects on human formants.