Friday, January 11, 2008

Phenomenological Model for Vowel Production

Comments on:
"A Phenomenological Model for Vowel Production in the Vocal Tract," Herbert M. Teager, and Shushan M. Teager
==================================================================

The main aim of an alternative, or definitive, model for speech (in particular, for vowel production) is to reopen a problem that seems solved yet remains uncodified and unexplored, largely because technology has bypassed the need to solve it.
Some scientists believe that 90% of the observations about speech can be explained by source-filter theory, but the remaining 10% is not adequately addressed.

The speech models tacitly depend on the validity of models for hearing. ****
The current model for speech, the linear filter model, views the voice as a combination of pure tones, and the ear (an imperfect Fourier analyzer) is believed to extract the magnitudes of these objective components.
A linear model cannot predict, test, or verify the transients and noise tones that exist in nature.
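
As a point of reference for what the linear filter model assumes, here is a minimal sketch in Python: an impulse train stands in for the glottal source and a cascade of two-pole resonators stands in for the formants. The sample rate, pitch, formant frequencies and bandwidths are illustrative assumptions, not values from Teager's paper, and the function names are made up for this sketch.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz; assumed value for this sketch


def impulse_train(pitch_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Idealised glottal source: one unit pulse per pitch period."""
    period = round(sample_rate / pitch_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(signal, centre_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Two-pole digital resonator standing in for a single formant."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius
    theta = 2.0 * math.pi * centre_hz / sample_rate      # pole angle
    a1, a2 = 2.0 * r * math.cos(theta), -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out


def vocal_tract(signal, formants):
    """Cascade one resonator per (centre frequency, bandwidth) pair."""
    for centre_hz, bandwidth_hz in formants:
        signal = resonator(signal, centre_hz, bandwidth_hz)
    return signal


def magnitude_at(signal, freq_hz, sample_rate=SAMPLE_RATE):
    """Amplitude of one frequency component (single-bin DFT)."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate)
              for k, x in enumerate(signal))
    return 2.0 * abs(acc) / len(signal)


# Assumed, vowel-like formant values chosen only for illustration.
FORMANTS = [(700.0, 80.0), (1200.0, 90.0)]
source = impulse_train(100.0, 4000)
vowel = vocal_tract(source, FORMANTS)

for freq in (300.0, 700.0, 1200.0):
    print(freq, magnitude_at(vowel, freq))
```

Because every stage is linear, the output is just the source harmonics reshaped by the resonances, and filtering a sum of two sources gives the sum of the filtered sources. That superposition property is exactly what the transients and noise tones mentioned above fall outside of.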
The observations made by Teager (to highlight the anomalies in the modeling world) are:
A) The blame for limited and constrained speech systems (and their performance) is laid on our limited understanding of the human brain. Teager gives the example of a bird (the myna) that can mimic long passages of human speech in spite of having an anatomically small cochlea, a vocal tract different from the human one, and a different cerebral cortex structure to mediate between the ear and spoken speech. (The cochlea is assumed to be the place where humans separate tones into frequencies: different sets of hair cells in the ear respond to different sets of frequencies.)
B) The various clicks, whistles, snores and other sounds that humans can produce cannot be modelled by the conventional linear source-filter model (these sounds do not originate at the glottis, which the model assumes to be the source).
C) In theory, formant values depend solely upon the cross-sectional areas along the center line of the supraglottal vocal tract.

Wednesday, January 9, 2008

Active Fluid Dynamics Voice Production Models

Comments on, "Active Fluid Dynamic Voice Production Models, or There is a Unicorn in the Garden," Herbert M. Teager and Shushan M. Teager.
--------------------------------------------------------------------------------------------


Above is the talk given by Herb Teager to explain the necessity for a new model of speech production (with a focus on the non-linear aspects of speech flow).

Herb Teager gives evidence that the speech flow is not at all planar and that it is "separated flow."

The following things bothered Teager the most about the way speech has been modeled:

1. Less than 1% of the equivalent mechanical lung input energy is involved in the rate of change of volume velocity at the glottis. The remaining 99.5% of the energy is "used up" somewhere, but the research community is NOT bothered about it.

2. Representing the speech apparatus as a "passive linear system" is not right, as the time-domain and frequency-domain observations aren't completely interchangeable.

3. On the behaviour of the speech signal in different media (the famous helium effect): the relative shift of the formants (usually seen for the formants below 200Hz) should be proportional to the change in density and, to an extent, related to the atmospheric pressure. But Teager observed the shift in the speech signal in the other direction, and by a different factor (the square root of the density).

4. The data obtained using hot-wire anemometry (and hence the observations made) indicate that (a) "uniform, plane acoustical wave were incompatible with the observed separated flows", (b) the flow is separated, possesses rotation and doesn't necessarily repeat itself from cycle to cycle, (c) flow patterns remain consistent for a given vowel, but differ radically across phonemes.

5. Flow pulses arising from the front and back of the mouth move at different speeds and with different attenuations. Most importantly, the pressures were uniform over cross sections within which separated flow occurred. The driving mechanism for the separated flow is still unknown.

6. Acoustic impedance, which relates pressures and flows in a sound wave, does not account for a flow wave that moves without a corresponding pressure wave.


*** Teager assumes that solitons (and the "momentum wave", a term coined by him) can help explain the phenomenon.

7. The continuity and motion equations used to model the speech flow are one-dimensional (or can be extended to two dimensions) but cannot be applied to 3-D flow and, most importantly, to separated flow, because they are an over-simplified version. "The basic unsimplified Navier-Stokes equations are intrinsically unstable and unsolvable."

The flow should be considered as "separated flow" made up of threads and gusts which change quickly in direction, time and space; that is, wave propagation in separated flow is asymmetric.
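
For comparison with item 3, here is a small Python sketch of what textbook linear acoustics predicts for the helium effect. It treats the vocal tract as a uniform tube closed at the glottis and open at the lips, so the resonances sit at odd multiples of c / 4L and scale with the speed of sound c. The tube length (17 cm), the temperature (20 C), pure helium as the medium, and the function names are all assumptions made for illustration; none of the numbers come from Teager's measurements.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol K)


def speed_of_sound(gamma, molar_mass_kg, temperature_k=293.15):
    """Ideal-gas speed of sound, c = sqrt(gamma * R * T / M)."""
    return math.sqrt(gamma * GAS_CONSTANT * temperature_k / molar_mass_kg)


def tube_formants(sound_speed, tube_length_m=0.17, count=3):
    """Quarter-wave resonances of a uniform tube closed at one end."""
    return [(2 * n - 1) * sound_speed / (4.0 * tube_length_m)
            for n in range(1, count + 1)]


# gamma and molar mass: air (1.40, 28.97 g/mol), helium (5/3, 4.003 g/mol)
c_air = speed_of_sound(1.40, 0.02897)
c_helium = speed_of_sound(5.0 / 3.0, 0.004003)

print("air    :", [round(f) for f in tube_formants(c_air)])
print("helium :", [round(f) for f in tube_formants(c_helium)])
print("ratio  :", round(c_helium / c_air, 2))
```

At equal pressure and temperature the gas density is proportional to the molar mass, so in this idealised picture every formant moves by the same factor, proportional to the square root of gamma divided by density. This is only the baseline against which the observations in item 3 are being contrasted.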

Teager then went on to discuss the models that he thinks will help us account for all or some of the above anomalies.

The basic premise is to omit the effects in the lungs, and just concentrate on the effects or air-flow interactions at the glottis and in the mouth.

The major focus will be to understand the non-linear energy feedback and modulation effects that occur in jet-cavity interactions, and the dynamics of rotating vortex flows.

1. When a flow leaves a constriction as a JET and the cavity (mouth/glottis) is asymmetric, the jet will differentially exhaust one side or the other and will thus be "attracted" to the nearest wall, but the attachment may be unstable or oscillatory. This separated flow will simultaneously give rise to vortexes (axial, radial or a combination of both).

2. The system at the glottis is complex, as the glottis provides a time-varying, directional source of separated air flow in the presence of vortexes that modify those flows with their own internal and external dynamics.

3. A jet-cavity interaction can result in:
a) An oscillator, when energy is added regeneratively while the energy is already high, or removed while it is low. Rotating flow can store kinetic energy, and then return it as pressure, with no differential flow. Also, the interaction with the separated flows acts as a non-linear yielding wall. In almost all cases, because of the radial, rotating vortex within, the oscillator exhibits a strong amplitude modulation at the internal vortex precession rate.
b) A non-linear plug, formed by the axial vortex at the converging outlet. When pressure in the chamber is high, outward flow is impeded by the compressed outlet vortex, but when pressure is low, the vortex expands and allows relatively more exit flow and an increase in vortex strength. This causes oscillations.
If two flows modulated at different frequencies collide (with the collision taking place over part of a pitch period), then all sorts of combinations of frequencies will result.
c) A filter: if for some reason the regenerative coupling is small, or happens over only part of a cycle, the oscillator will degenerate into a filter.
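
The remark about colliding flows producing combinations of frequencies can be illustrated with a toy Python example. Here a plain multiplication of two sinusoids stands in for the non-linear interaction, which is an assumption of this sketch and not Teager's flow model; the frequencies (200 Hz and 30 Hz) and the helper names are likewise made up. A linear sum keeps only the original two components, while the product contains only their sum and difference.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz; assumed
N_SAMPLES = 8000    # one second, so integer frequencies fall on exact bins


def tone(freq_hz):
    """Unit-amplitude sine wave."""
    return [math.sin(2.0 * math.pi * freq_hz * k / SAMPLE_RATE)
            for k in range(N_SAMPLES)]


def magnitude_at(signal, freq_hz):
    """Amplitude of one frequency component (single-bin DFT)."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / SAMPLE_RATE)
              for k, x in enumerate(signal))
    return 2.0 * abs(acc) / len(signal)


a = tone(200.0)
b = tone(30.0)
linear_sum = [x + y for x, y in zip(a, b)]  # superposition only
product = [x * y for x, y in zip(a, b)]     # stand-in for a non-linear interaction

for freq in (30.0, 170.0, 200.0, 230.0):
    print(freq,
          round(magnitude_at(linear_sum, freq), 3),
          round(magnitude_at(product, freq), 3))
```

The new components at 170 Hz and 230 Hz are the kind of result that no linear filter acting on the two original tones could produce.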


Teager suggested:
1) to perform more experiments on humans to understand the whole gamut of jet-cavity interactions, or similar concepts, within the different areas of the mouth and throat and in the coupling of the whole system.
2) avoid the computer simulations of the "simplified Navier-Stokes equations".

During the discussion that followed, Teager suggested that:
A mask-type pneumotachograph needs to be calibrated for pulsatile flows (as these flows occur during speech production) in order to assess the flow variations with instruments other than a hot-wire anemometer.

The nominal range for measurements of flows within the mouth / throat:
Frequency range: 0-4 kHz
Physiological range: 10-300 cm/sec

The frequency response of the instrumentation used by Teager is flat (over 0-5 kHz) and is independent of the flow rate.

-----------------------------------------------------------------------------------------------
*** we have affirmed that the seven dimensional (three linear velocity and three angular velocity, plus time) flow patterns are unique for each vowel, and have published four sets of manifestly different trajectory data to back up this postulate.
-----------------------------------------------------------------------------------------------