Neurobiology of Language - Audiovisual Recalibration of Vowel Categories

23 October 2017
Speech perception is a complex task. It involves making sense of an acoustic signal that is distorted by background noise (traffic noise when talking on the street, other people’s conversations in a noisy bar, and so forth) as well as by large variability in the speech signal itself. One way of dealing with this variability is to draw on multiple sources of information, for instance by combining the auditory and visual modalities.

In a recent study, Franken and colleagues used an audiovisual paradigm to investigate whether listeners make use of visual information, such as the speaker’s lip movements, to recalibrate auditory speech categories. Participants were exposed to videos of a speaker visibly articulating either the vowel /e/ or the vowel /ø/, while the audio was ambiguous between the two vowels. Participants who were exposed to videos of an /e/ articulation were more likely to interpret the ambiguous vowel sound as /e/ in a later auditory-only identification task, while participants exposed to /ø/ videos were more likely to interpret the ambiguous sound as /ø/. These results suggest that listeners do make use of visual information (lip-reading) to recalibrate auditory vowel categories.

Although this type of experiment has been run previously with consonant categories (e.g., the distinction between /aba/ and /ada/), this is the first study to investigate audiovisual recalibration in vowels. This is important, as vowels and consonants may behave quite differently. For instance, it is well known that vowels are perceived less categorically than consonants and therefore have fuzzier category boundaries. This could mean that audiovisual recalibration is even more warranted for vowels, especially in contexts where the location of the phoneme boundary is unclear (e.g., noisy acoustic environments or speakers with unfamiliar accents).
