In a recent study, Franken and colleagues used an audiovisual paradigm to investigate whether listeners use visual information, such as a speaker’s lip movements, to recalibrate auditory speech categories. Participants watched videos of a speaker visually articulating either the vowel /e/ or /ø/, while the accompanying audio was ambiguous between the two vowels. In a subsequent auditory-only identification task, participants who had been exposed to /e/ articulations were more likely to interpret the ambiguous vowel sound as /e/, whereas those exposed to /ø/ articulations were more likely to interpret it as /ø/. These results suggest that listeners do indeed use visual information (lip-reading) to recalibrate auditory vowel categories.
Although this type of experiment had previously been run with consonant categories (e.g., the distinction between /aba/ and /ada/), this is the first study to investigate audiovisual recalibration for vowels. This matters, because vowels and consonants may behave quite differently. For instance, vowels are known to be perceived less categorically than consonants and therefore have fuzzier category boundaries. If anything, this could make audiovisual recalibration even more useful for vowels, especially in contexts where the location of the phoneme boundary is unclear (e.g., noisy acoustic environments or speakers with unfamiliar accents).