Anne Cutler


  • Cutler, A., Baldacchino, J., Wagner, A., & Peter, V. (2016). Language-specificity in early cortical responses to speech sounds. Poster presented at the Eighth Annual Meeting of the Society for the Neurobiology of Language (SNL 2016), London, UK.


    The continuity of speech articulation ensures that in all languages, spoken sounds influence one another. Thus there are potentially cues to a sound’s identity in the realisation of surrounding sounds. Listeners make use of such coarticulatory cues – but not always. It has long been known (Harris, Lang. Sp., 1958) that English-speakers use this coarticulation to identify [f] but not [s]. The reason is that place of articulation cues can distinguish [f] from its very close perceptual competitor [θ] (deaf/death), while [s] has no such perceptual competitor and hence less need of such disambiguation. In languages with [f] but no [θ] (e.g., Dutch, Polish), listeners do not use coarticulation to identify [f], whereas listeners do use coarticulation to identify [s] where [s] has close competitors (Polish; Wagner et al., JASA, 2006). The patterning of coarticulation cue use is thus language-specific. In those studies, listeners’ use of coarticulatory cues was revealed by comparing responses to the same sounds in matching versus mismatching phonetic context (e.g., in afa, asa either as originally recorded, or with the consonants cross-spliced); sensitivity to this difference signals attention to coarticulation. We used this same method to assess whether language-specificity could be observed in the early cortical responses to speech, by measuring auditory evoked potentials in response to change in an ongoing sound (Acoustic Change Complex [ACC]; Martin & Boothroyd, JASA, 2000). 18 undergraduate native speakers of Australian English (11 females) heard, while watching silent video, 900 bisyllables (150 repetitions each of afa and asa in original, identity-spliced and cross-spliced realisation, where identity-spliced afa has initial [a] from another utterance of afa, cross-spliced afa has [a] from asa). 
If the ACC exhibits the language-specific differential response to [f] versus [s], we predict a significant difference across stimulus types (cross-spliced versus the other two stimulus types) for afa but not for asa. Listeners’ EEG was recorded (BioSemi, 64 channels), filtered between 0.1 and 30 Hz, divided into epochs from -100 to +1000 ms relative to token onset, and the epochs averaged separately for each bisyllable and stimulus type. The ACC amplitude was calculated from the grand averaged waveform across listeners as the difference in amplitude between the N1 and P2 peaks at the Fz electrode site; these differences were analysed in Bonferroni-corrected planned comparisons across the three stimulus types (unspliced, identity-spliced, cross-spliced) for each of afa and asa. For asa, the planned comparisons showed no differences at all between stimulus types. For afa, in contrast, the comparison between unspliced and cross-spliced stimulus types revealed that cross-spliced tokens generated a significantly smaller ACC: F(1,17)=5.98, p<.05. The amplitudes from the unspliced and identity-spliced afa stimuli, however, did not differ significantly. These findings indicate that English-speaking listeners’ coarticulation usage patterns – sensitivity to cues in a preceding vowel in the case of [f], insensitivity in the case of [s] – can be detected in the ACC, suggesting that native language experience tailors even the initial cortical responses to speech sounds.
  • Ullas, S., Eisner, F., Cutler, A., & Formisano, E. (2016). Lexical and lip-reading information as sources of phonemic boundary recalibration. Poster presented at the Eighth Annual Meeting of the Society for the Neurobiology of Language (SNL 2016), London, UK.


    Listeners can flexibly adjust boundaries between phonemes when exposed to biased information. Ambiguous sounds are particularly susceptible to being interpreted as certain phonemes depending on the surrounding context, so that if they are embedded into words, the sound can be perceived as the phoneme that would naturally occur in the word. Similarly, ambiguous sounds presented simultaneously with videos of a speaker’s lip movements can also affect the listener’s perception, where the ambiguous sound can be interpreted as the phoneme corresponding with the lip movements of the speaker. Listeners have been shown to use these two forms of phonetic boundary recalibration to adapt in contexts where speech is unclear, due to noise or exposure to a new accent. The current study was designed to directly compare phonemic recalibration effects based on lexical and lip-reading exposures. A specific goal was to investigate how easily listeners are able to follow alternating lexical and lip-reading exposures, in order to determine the optimal way in which listeners can switch between the two. In the experiment, participants (N=28) were exposed to blocked presentations of words or videos embedded with an individually determined, ambiguous token halfway between /oop/ and /oot/. In lexical blocks, the stimuli consisted of audio recordings of Dutch words that ended in either /oop/ or /oot/, with the naturally occurring ending replaced with the ambiguous token. In lip-reading exposure blocks, the stimuli were made up of video recordings of the same native Dutch speaker pronouncing pseudo-words that visually appeared to end in /oop/ or /oot/, but the audio of the ending was also replaced with the same ambiguous token. Two types of presentations were administered to two groups of 14, with one version switching the modality of exposure after every block, and the other every four blocks. 
Following each exposure block, a 6-item post-test was presented, where participants heard the ambiguous token and its two neighbors from a 10-step continuum in isolation, each presented twice, and were asked to report if each sound resembled /oop/ or /oot/. Results from a mixed-factor ANOVA determined that subjects could flexibly adjust phoneme boundaries, as there was a main effect of the phoneme being biased, such that there was a greater proportion of /oot/ responses (pooled across all post-test items) following /oot/ bias blocks than following /oop/ bias blocks, F(1,28) = 15.828, p<0.01. There was also a main effect of exposure type, comparing lexical and lip-reading exposures, F(1,28) = 4.405, p<0.05, which indicated that recalibration was stronger following lip-reading exposure than lexical exposure. Additionally, a significant interaction between exposure type and phoneme bias was revealed, F(1,28) = 6.475, p<0.05, showing that the magnitude of the difference between p- and t-biased blocks was also greater with lip-reading exposure. No significant differences were found between the two presentation types, either for exposure type or for phoneme bias. These results indicate that phoneme boundaries can be influenced by alternating lexical and lip-reading sources of information, and that lip-reading information is especially effective in accomplishing this.
  • Cutler, A. (2015). Big issues in speech perception: Abstraction and nativeness [Plenary Lecture]. Talk presented at the 18th International Congress of Phonetic Sciences (ICPhS 2015), Glasgow, UK, 2015-08-10 - 2015-08-14.
