Speech has been remade automatically from a buzzer-like tone and a hiss-like noise corresponding to the cord-tone and the breath-tone of normal speech. Controls of pitch and spectrum obtained from a talker's speech are applied to make the synthetic speech copy the original closely enough for good intelligibility, although the currents used in such controls contain only low syllabic frequencies of the order of 10 cycles per second, as contrasted with frequencies of 100 to 3000 cycles per second in the remade speech. The isolation of these speech-defining signals of pitch and spectrum makes it possible to reconstruct the speech to a wide variety of specifications. Striking demonstrations upon altering the pitch of the remade speech stress the contribution of the pitch to the emotional content of speech. Similarly, the spectrum is shown to contribute most of the intelligibility to the speech.
Serious attempts at speech synthesis started with von Kempelen's manually operated speaking machine, which he described in his book published in 1791. The generation of vowel sounds has been studied by Willis, Wheatstone, Helmholtz, Scripture and many others. Stewart (Nature, Sept. 2, 1922) built a dial-controlled electrical synthesizer of simple speech sounds; this type of circuit was demonstrated publicly by Fletcher in February, 1924.
[Fig. 4. Pitch control circuit: frequency discriminator, frequency meter, relaxation oscillator; sound-stream output to spectrum control circuit.]
Speech synthesizing is here discussed in the terminology of carrier circuits. The speaker is pictured as a sort of radio broadcast transmitter, with the message to be sent out originating in the studio of the talker's brain and manifesting itself in muscular wave motions in the vocal tract. Although these motions contain the message, they are inaudible because they occur at syllabic rates. An audible sound is needed to pass the message into the listener's ear. This is provided by the carrier, a group of higher frequency waves in the audible range set up by oscillatory action at the vocal cords or elsewhere in the vocal tract. These carrier waves, either in their generation or in their transmission, are modulated by the message waves to form the speech waves. As the speech waves contain the message information on an audible carrier, they are adapted to broadcast reception by receiving sets in the form of listeners' ears. The message is then recovered by the listeners' minds.
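The carrier picture above can be sketched numerically. The following is only an illustration under stated assumptions, not the circuit described in the paper: the sample rate, the 100-cycle pitch, the 5-cycle syllabic rate, the raised-cosine envelope, and all function names are hypothetical choices made for the example. It multiplies an audible buzzer-like carrier (a sum of harmonics up to 3000 cycles) by a slow, inaudible message wave.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for illustration)


def buzz_carrier(t, pitch_hz=100.0, n_harmonics=30):
    """Buzzer-like carrier: equal-amplitude harmonics of the pitch, scaled to |x| <= 1."""
    total = sum(math.sin(2 * math.pi * k * pitch_hz * t)
                for k in range(1, n_harmonics + 1))
    return total / n_harmonics


def message_envelope(t, syllabic_hz=5.0):
    """Slow 'message' wave at a syllabic rate, varying between 0 and 1 (hypothetical shape)."""
    return 0.5 * (1.0 - math.cos(2 * math.pi * syllabic_hz * t))


def modulated_speech_wave(duration_s=0.4):
    """Audible carrier modulated by the inaudible message wave."""
    n = round(duration_s * SAMPLE_RATE)
    return [message_envelope(i / SAMPLE_RATE) * buzz_carrier(i / SAMPLE_RATE)
            for i in range(n)]
```

The envelope alone contains nothing above a few cycles per second, yet it fully determines how the audible carrier rises and falls, which is the sense in which the message rides on the carrier.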
In this appendix the relationship between the hearing loss for speech and the hearing loss audiogram will be considered (H. Fletcher, "A method of calculating hearing loss for speech from an audiogram," J. Acoust. Soc. Am. 22, 1 (1950)). Let $\beta_f$ be the hearing loss at the frequency $f$ for a pure tone. It is the ordinate in the audiogram. If we consider that $\beta_f$ has the same effect upon the threshold level as an attenuation $-R$ from the flat response system, then by analogy to Eq. (23) the hearing loss for speech $\beta_s$ is given by
$$10^{-\beta_s/10} = \int_0^\infty G'\,10^{-\beta_f/10}\,df. \qquad (99)$$
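A discrete version of Eq. (99) may make the weighting easier to see. The sketch below is an assumption-laden illustration, not Fletcher's procedure: the integral is replaced by a sum over audiogram frequencies, the weights are HYPOTHETICAL placeholders rather than values of the weighting function in the integrand, and they are normalized to sum to one so that a flat pure-tone loss yields the same loss for speech.

```python
import math

# HYPOTHETICAL weights standing in for the weighting function in Eq. (99);
# they are not taken from the source and serve only to make the sketch runnable.
ILLUSTRATIVE_WEIGHTS = {500: 0.2, 1000: 0.3, 2000: 0.3, 4000: 0.2}


def speech_hearing_loss_db(audiogram_db, weights=ILLUSTRATIVE_WEIGHTS):
    """Discrete form of Eq. (99): 10^(-loss_speech/10) = sum_i w_i * 10^(-loss_i/10).

    audiogram_db maps frequency (cycles per second) to pure-tone hearing loss in dB.
    Weights are normalized here so that they sum to one (an assumption).
    """
    total = sum(weights.values())
    acc = sum((w / total) * 10 ** (-audiogram_db[f] / 10)
              for f, w in weights.items())
    return -10 * math.log10(acc)
```

Because the average is taken over power ratios rather than decibels, a severe loss confined to one band raises the computed loss for speech only modestly: with these placeholder weights, a 100 dB loss at 4000 cycles alone gives roughly 1 dB.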
Speech has been remade from a buzzer-like tone and a hiss-like noise with automatic control of the pitch and timbre by the talker's speech that is being remade. The newly created speech may be altered from the original in a variety of ways since the fundamental variables of speech are under experimental control between the steps of speech analyzing and synthesizing that together remake the speech. The demonstrations will show in detail the steps in remaking speech as well as the effects of changing various elements. Included among these demonstrations will be the automatic raising and lowering of pitch, changing the octave range of pitch, reversing the inflection, removing the melody from song, transforming voiced speech to a whisper, the relative contributions of the buzz and hiss to the remade speech, etc. The demonstrations will show some striking characteristics of speech such as its recreation under the control of currents too low in frequency to be audible and the use of timbre for intelligibility and of inflection for emotional content. The synthesizing process here used provided the basis for the synthesis of speech by the manipulation of keys as accomplished in the Voder, one of the Bell System's exhibits at this year's World's Fair.
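Several of the pitch demonstrations listed above can be expressed as simple operations on a pitch control signal. The sketch below is a hypothetical illustration only; the source does not say how the alterations were realized in the apparatus. It assumes the pitch control is a list of frame values in cycles per second, with 0 marking unvoiced (hiss) frames, and all function names are invented for the example.

```python
import math


def _log_center(pitch_hz):
    """Mean of log2 pitch over voiced frames, or None if there are none."""
    voiced = [p for p in pitch_hz if p > 0]
    if not voiced:
        return None
    return sum(math.log2(p) for p in voiced) / len(voiced)


def shift_pitch(pitch_hz, octaves):
    """Raise (positive) or lower (negative) the pitch by a number of octaves."""
    return [p * 2 ** octaves if p > 0 else 0.0 for p in pitch_hz]


def scale_range(pitch_hz, factor):
    """Expand or compress the pitch range about its log-mean."""
    center = _log_center(pitch_hz)
    if center is None:
        return list(pitch_hz)
    return [2 ** (center + factor * (math.log2(p) - center)) if p > 0 else 0.0
            for p in pitch_hz]


def reverse_inflection(pitch_hz):
    """Mirror the contour so that rising inflections fall and vice versa."""
    return scale_range(pitch_hz, -1.0)


def monotone(pitch_hz):
    """Remove the melody by holding every voiced frame at one pitch."""
    return scale_range(pitch_hz, 0.0)


def whisper(pitch_hz):
    """Mark every frame unvoiced so that only the hiss source is used."""
    return [0.0 for _ in pitch_hz]
```

Working in log pitch keeps the operations musically symmetric: reversing the inflection of a contour that alternates between 100 and 200 cycles simply swaps the two values.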