NATO Advanced Study Institute


Speech processing

Manfred Schroeder
Universitaet Goettingen

The processing of speech signals has a long and venerable history. As early as 1770 Wolfgang von Kempelen demonstrated his mechanical talking machine to the courts of Europe. In 1928 Homer Dudley invented the 'vocoder' (voice coder), arguing that speech is specified by a few slowly varying parameters that require only a fraction of the telephone bandwidth for transmission. A digital vocoder was first put into service in World War II for a secure telephone link connecting Roosevelt, Churchill and major military commands around the world.
Perceptual coders that exploit properties of the human ear, such as 'phase deafness' and auditory masking, have been demonstrated to transmit speech (and even high-quality music!) at a fraction of a bit per sample. Applications for mobile radio, voice e-mail, and Internet radio abound.
The success of speech recognition depends on the size of the vocabulary and the quality of the speech signal. The zero-error recognition of unrestricted, continuous speech from a noisy environment (the 'electronic secretary'), however, is still in the future.
Speaker identification has helped investigate several disasters (a mid-air collision over the Grand Canyon, the burning-up of three astronauts). But its forensic applications are limited if the pool of potential speakers is large. Speaker verification is of increasing importance in limiting access to restricted data (financial, medical, military).
Text-to-speech (TTS), although still suffering from an 'electronic accent', has innumerable applications, from 'talking books' for the blind to a wide variety of spoken-language information services.

Manfred Schroeder studied mathematics and physics at the University of Goettingen, Germany. In 1954 he joined Bell Laboratories in Murray Hill, New Jersey. At the Labs he worked in speech, hearing and room acoustics until 1987. In 1969 he became a professor of physics at Goettingen, commuting twice yearly to Bell. Schroeder holds 45 U.S. patents for inventions in speech processing and other fields. He was a member of the National Stereophonic Radio Committee that set the standards for FM stereo broadcasting.
In 1972 Schroeder was awarded the Gold Medal of the Audio Engineering Society. He also received the Rayleigh Medal from the British Institute of Acoustics and the Helmholtz Medal from the German Acoustical Society. In 1991 the Acoustical Society of America awarded him its Gold Medal "for his theoretical and practical contributions to human communication through innovative applications of mathematics to speech, hearing and concert hall acoustics."
An early practitioner of computer graphics, Schroeder won First Prize at the 1969 International Computer Art Competition in Las Vegas.
Schroeder has written two books: "Number Theory in Science and Communication" and "Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise." He also edited the books "Speech and Speaker Recognition" and "Hundert Jahre Friedrich Hund."
Schroeder is a member of the National Academy of Engineering and a Fellow of the American Academy of Arts and Sciences and of the New York and Goettingen Academies. He is also a founding member of the Institut de Recherche et Coordination Acoustique/Musique, Centre Pompidou, Paris.

