The SpeechLab facilities


The MPI provides a number of software tools for manipulating and analysing speech. Together with the accompanying hardware, these tools are called the SpeechLab. The software can be made available on every Hewlett-Packard and Silicon Graphics UNIX workstation. Staff members at the MPI generally run the SpeechLab software on their own desktop computers. For student assistants and guests, a dedicated room is available, which is also called the Speech Laboratory.

The Speech Laboratory houses a number of Silicon Graphics UNIX computers dedicated to speech handling. There is also audio equipment for recording speech to tape and for digitising speech from tape.

Software Tools

The main speech processing software in use is the WAVES/ESPS package. WAVES is the interactive graphical display part of the package, while ESPS is a collection of signal analysis modules that work in close cooperation with WAVES (the xwaves program). WAVES is a professional speech analysis tool that gives users at the MPI all the necessary functions: it can extract F0 contours, generate spectra for analysis purposes, perform various types of spectral envelope estimation, and generate colour spectrograms.

For an example, see the following figures. They show the speech waveforms, the intensity and F0 contours, and the spectrograms of two Vietnamese utterances that share the same segmental information (i.e. the same phonemes) but have different tones (i.e. they differ mainly in intonation, intensity, and segmental duration). The left F0 contour shows an extreme rise at the end of the utterance, while in the right example the F0 contour remains flat.

[Figure: full display of the SpeechLab tools]

These examples indicate how useful the tool can be for a variety of speech analysis tasks.

There are also a number of Perl scripts and special programmes available that were developed at the MPI and that augment the basic WAVES/ESPS functionality.


Labeltool program
Once a digitised speech file is available, the labeltool program lets the user quickly label speech segments using a previously prepared list of label names. The result of the labeling work is an ASCII label file. Because this facility is heavily used for various purposes at the MPI, it has to be very efficient. We therefore did not rely on the standard functionality of WAVES but built our own extension.


Splice program
With this program, the user specifies a sequence of previously labeled segments and other elements such as noise or beeps, and the program produces a new speech file that can be used as a stimulus in experiments. Splice offers all the features that were requested at the MPI over the past years and can therefore be considered a mature and comprehensive tool. The output speech file of Splice can be converted to the format used by the NESU experiment system.
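The splicing idea can be illustrated with a short Python sketch. It is not the Splice program itself: the specification format, the label representation, the 16 kHz sampling rate, and all function names are assumptions made for this example only.

```python
from typing import Dict, List, Sequence, Tuple, Union

SAMPLE_RATE = 16000  # assumed sampling rate in Hz; the text does not state one

# Hypothetical label representation: label name -> (start_sec, end_sec)
Labels = Dict[str, Tuple[float, float]]
# Hypothetical spec item: ("segment", label_name) or ("silence", seconds)
SpecItem = Tuple[str, Union[str, float]]


def cut_segment(samples: Sequence[int], labels: Labels, name: str) -> List[int]:
    """Return the samples that lie inside the labeled segment `name`."""
    start_sec, end_sec = labels[name]
    start = round(start_sec * SAMPLE_RATE)
    end = round(end_sec * SAMPLE_RATE)
    return list(samples[start:end])


def silence(duration_sec: float) -> List[int]:
    """Return a run of zero-valued samples of the requested duration."""
    return [0] * round(duration_sec * SAMPLE_RATE)


def splice(samples: Sequence[int], labels: Labels,
           spec: Sequence[SpecItem]) -> List[int]:
    """Concatenate labeled segments and pauses in the order given by `spec`."""
    out: List[int] = []
    for kind, value in spec:
        if kind == "segment":
            out.extend(cut_segment(samples, labels, value))
        elif kind == "silence":
            out.extend(silence(value))
        else:
            raise ValueError(f"unknown spec item: {kind!r}")
    return out


# Usage: reorder two labeled segments and put a 100 ms pause between them.
source = list(range(SAMPLE_RATE))  # one second of dummy samples
labels = {"ba": (0.0, 0.25), "na": (0.5, 0.75)}
stimulus = splice(source, labels,
                  [("segment", "na"), ("silence", 0.1), ("segment", "ba")])
```

A real stimulus would also need noise or beep elements and file input and output, which are left out here to keep the sketch short.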


Synthesis program
With this program it is possible to generate high-quality synthesized speech. It lets the user manipulate segment durations and performs the manipulations in the time-domain representation only, i.e. without transforming the signal into a parameter space. The manipulations are based on an estimate of the pitch boundaries. Depending on the user's specification, pitch boundary segments are either added or removed.
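A minimal Python sketch of this kind of time-domain duration change is given below. It assumes that the pitch boundaries are already known as ascending sample indices and that a "pitch boundary segment" is the stretch between two successive boundaries. The function name and the nearest-period mapping are choices made for this illustration; the actual program may select and join segments differently.

```python
from typing import List, Sequence


def change_duration(samples: Sequence[float],
                    boundaries: Sequence[int],
                    factor: float) -> List[float]:
    """Lengthen (factor > 1) or shorten (factor < 1) a stretch of speech
    by repeating or dropping whole pitch periods in the time domain.

    `boundaries` holds ascending sample indices; each consecutive pair
    delimits one pitch period. Samples outside the first and last
    boundary are copied unchanged.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if len(boundaries) < 2:
        raise ValueError("at least two pitch boundaries are required")

    periods = [list(samples[a:b]) for a, b in zip(boundaries, boundaries[1:])]
    n_in = len(periods)
    n_out = max(1, round(n_in * factor))

    out: List[float] = list(samples[:boundaries[0]])
    for k in range(n_out):
        # Map every output period back to an input period, so that
        # periods are duplicated when stretching and skipped when shrinking.
        src = min(n_in - 1, int(k * n_in / n_out))
        out.extend(periods[src])
    out.extend(samples[boundaries[-1]:])
    return out


# Usage: four dummy pitch periods of 10 samples each, stretched by 50 %.
signal = list(range(40))
marks = [0, 10, 20, 30, 40]
longer = change_duration(signal, marks, 1.5)
shorter = change_duration(signal, marks, 0.5)
```

Because the signal is never converted into a parameter space, the original waveform shape of each period is preserved. The sketch does not smooth the joins between periods.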


Acoustical Band Filter program
This program was created to simulate Hermansky's Acoustical Band Filters, which were regarded as optimal representations of the speech signal for further processing such as speech recognition. For each speech frame of about 17 ms, the program generates 16 band filter values.
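To give an idea of what "16 band values per frame" means, the following Python sketch sums the power spectrum of one 17 ms frame into 16 bands of equal width on the Bark scale. This is a simplified stand-in and not the MPI program: the 16 kHz sampling rate, the rectangular band shapes, and all names are assumptions. Only the Bark warping formula, 6 * asinh(f / 600), is taken from Hermansky's perceptual linear prediction analysis.

```python
import cmath
import math
from typing import List, Sequence

SAMPLE_RATE = 16000  # assumed; the text gives no sampling rate
FRAME_MS = 17        # frame length from the text ("about 17 ms")
N_BANDS = 16         # number of band values from the text
FRAME_LEN = SAMPLE_RATE * FRAME_MS // 1000  # samples per frame


def hz_to_bark(freq_hz: float) -> float:
    """Bark warping as used in Hermansky's PLP analysis."""
    return 6.0 * math.asinh(freq_hz / 600.0)


def power_spectrum(frame: Sequence[float]) -> List[float]:
    """Naive DFT power spectrum for bins 0 .. N/2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def band_energies(frame: Sequence[float]) -> List[float]:
    """Sum spectral power into N_BANDS rectangular bands that have
    equal width on the Bark scale (a simplification of real filter shapes)."""
    n = len(frame)
    max_bark = hz_to_bark(SAMPLE_RATE / 2)
    energies = [0.0] * N_BANDS
    for k, power in enumerate(power_spectrum(frame)):
        freq = k * SAMPLE_RATE / n
        band = min(N_BANDS - 1, int(hz_to_bark(freq) / max_bark * N_BANDS))
        energies[band] += power
    return energies


# Usage: one frame of a 1 kHz sine wave.
frame = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(FRAME_LEN)]
values = band_energies(frame)
```

For the sine wave in the usage example, nearly all energy should end up in the single band that contains 1 kHz.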


Special scripts
  • computing statistics on pitch contours
  • manipulating label files
  • converting speech files
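As an illustration of the first item, pitch contour statistics might be computed along the following lines. The MPI scripts themselves are written in Perl and are not reproduced here. This Python sketch assumes the contour is a list of F0 values in Hz, one per analysis frame, with unvoiced frames marked by 0. The function name and the selection of statistics are hypothetical.

```python
import statistics
from typing import Dict, Sequence


def f0_statistics(contour: Sequence[float]) -> Dict[str, float]:
    """Summarise an F0 contour, ignoring unvoiced frames (assumed to be 0)."""
    voiced = [f0 for f0 in contour if f0 > 0]
    if not voiced:
        raise ValueError("contour contains no voiced frames")
    return {
        "voiced_frames": len(voiced),
        "mean": statistics.mean(voiced),
        "median": statistics.median(voiced),
        "min": min(voiced),
        "max": max(voiced),
        "range": max(voiced) - min(voiced),
    }


# Usage: a short dummy contour with unvoiced stretches.
stats = f0_statistics([0, 0, 100, 110, 120, 0, 130, 0])
```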

The following images give a snapshot of the SpeechLab working environment during a labeling session. At the top is the typical WAVES speech waveform window, in which vertical bars mark the label segments. Below that window, the labels of the various segments are shown in a time-aligned manner. The special window at the left gives an idea of the user interface of the in-house label tool.

A WAVES display and the labeltool programme

