Anne Cutler

Publications

Displaying 1 - 19 of 19
  • Asano, Y., Yuan, C., Grohe, A.-K., Weber, A., Antoniou, M., & Cutler, A. (2020). Uptalk interpretation as a function of listening experience. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 735-739). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-150.

    Abstract

    The term “uptalk” describes utterance-final pitch rises that carry no sentence-structural information. Uptalk is usually dialectal or sociolectal, and Australian English (AusEng) is particularly known for this attribute. We ask here whether experience with an uptalk variety affects listeners’ ability to categorise rising pitch contours on the basis of the timing and height of their onset and offset. Listeners were two groups of English-speakers (AusEng, and American English), and three groups of listeners with L2 English: one group with Mandarin as L1 and experience of listening to AusEng, one with German as L1 and experience of listening to AusEng, and one with German as L1 but no AusEng experience. They heard nouns (e.g. flower, piano) in the framework “Got a NOUN”, each ending with a pitch rise artificially manipulated on three contrasts: low vs. high rise onset, low vs. high rise offset and early vs. late rise onset. Their task was to categorise the tokens as “question” or “statement”, and we analysed the effect of the pitch contrasts on their judgements. Only the native AusEng listeners were able to use the pitch contrasts systematically in making these categorisations.
  • Bruggeman, L., & Cutler, A. (2020). No L1 privilege in talker adaptation. Bilingualism: Language and Cognition, 23(3), 681-693. doi:10.1017/S1366728919000646.

    Abstract

    As a rule, listening is easier in first (L1) than second languages (L2); difficult L2 listening can challenge even highly proficient users. We here examine one particular listening function, adaptation to novel talkers, in such a high-proficiency population: Dutch emigrants to Australia, predominantly using English outside the family, but all also retaining L1 proficiency. Using lexically-guided perceptual learning (Norris, McQueen & Cutler, 2003), we investigated these listeners’ adaptation to an ambiguous speech sound, in parallel experiments in both their L1 and their L2. A control study established that perceptual learning outcomes were unaffected by the procedural measures required for this double comparison. The emigrants showed equivalent proficiency in tests in both languages, robust perceptual adaptation in their L2, English, but no adaptation in L1. We propose that adaptation to novel talkers is a language-specific skill requiring regular novel practice; a limited set of known (family) interlocutors cannot meet this requirement.
  • Ip, M. H. K., & Cutler, A. (2020). Universals of listening: Equivalent prosodic entrainment in tone and non-tone languages. Cognition, 202: 104311. doi:10.1016/j.cognition.2020.104311.

    Abstract

    In English and Dutch, listeners entrain to prosodic contours to predict where focus will fall in an utterance. Here, we ask whether this strategy is universally available, even in languages with very different phonological systems (e.g., tone versus non-tone languages). In a phoneme detection experiment, we examined whether prosodic entrainment also occurs in Mandarin Chinese, a tone language, where the use of various suprasegmental cues to lexical identity may take precedence over their use in salience. Consistent with the results from Germanic languages, response times were facilitated when preceding intonation predicted high stress on the target-bearing word, and the lexical tone of the target word (i.e., rising versus falling) did not affect the Mandarin listeners' response. Further, the extent to which prosodic entrainment was used to detect the target phoneme was the same in both English and Mandarin listeners. Nevertheless, native Mandarin speakers did not adopt an entrainment strategy when the sentences were presented in English, consistent with the suggestion that L2 listening may be strained by additional functional load from prosodic processing. These findings have implications for how universal and language-specific mechanisms interact in the perception of focus structure in everyday discourse.

    Additional information

    supplementary data
  • Yu, J., Mailhammer, R., & Cutler, A. (2020). Vocabulary structure affects word recognition: Evidence from German listeners. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 474-478). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-97.

    Abstract

    Lexical stress is realised similarly in English, German, and Dutch. On a suprasegmental level, stressed syllables tend to be longer and more acoustically salient than unstressed syllables; segmentally, vowels in unstressed syllables are often reduced. The frequency of unreduced unstressed syllables (where only the suprasegmental cues indicate lack of stress) however, differs across the languages. The present studies test whether listener behaviour is affected by these vocabulary differences, by investigating German listeners’ use of suprasegmental cues to lexical stress in German and English word recognition. In a forced-choice identification task, German listeners correctly assigned single-syllable fragments (e.g., Kon-) to one of two words differing in stress (KONto, konZEPT). Thus, German listeners can exploit suprasegmental information for identifying words. German listeners also performed above chance in a similar task in English (with, e.g., DIver, diVERT), i.e., their sensitivity to these cues also transferred to a nonnative language. An English listener group, in contrast, failed in the English fragment task. These findings mirror vocabulary patterns: German has more words with unreduced unstressed syllables than English does.
  • Mandal, S., Best, C. T., Shaw, J., & Cutler, A. (2020). Bilingual phonology in dichotic perception: A case study of Malayalam and English voicing. Glossa: A Journal of General Linguistics, 5(1): 73. doi:10.5334/gjgl.853.

    Abstract

    Listeners often experience cocktail-party situations, encountering multiple ongoing conversa- tions while tracking just one. Capturing the words spoken under such conditions requires selec- tive attention and processing, which involves using phonetic details to discern phonological structure. How do bilinguals accomplish this in L1-L2 competition? We addressed that question using a dichotic listening task with fluent Malayalam-English bilinguals, in which they were pre- sented with synchronized nonce words, one in each language in separate ears, with competing onsets of a labial stop (Malayalam) and a labial fricative (English), both voiced or both voiceless. They were required to attend to the Malayalam or the English item, in separate blocks, and report the initial consonant they heard. We found that perceptual intrusions from the unattended to the attended language were influenced by voicing, with more intrusions on voiced than voiceless tri- als. This result supports our proposal for the feature specification of consonants in Malayalam- English bilinguals, which makes use of privative features, underspecification and the “standard approach” to laryngeal features, as against “laryngeal realism”. Given this representational account, we observe that intrusions result from phonetic properties in the unattended signal being assimilated to the closest matching phonological category in the attended language, and are more likely for segments with a greater number of phonological feature specifications.
  • Ullas, S., Formisano, E., Eisner, F., & Cutler, A. (2020). Audiovisual and lexical cues do not additively enhance perceptual adaptation. Psychonomic Bulletin & Review, 27, 707-715. doi:10.3758/s13423-020-01728-5.

    Abstract

    When listeners experience difficulty in understanding a speaker, lexical and audiovisual (or lipreading) information can be a helpful source of guidance. These two types of information embedded in speech can also guide perceptual adjustment, also known as recalibration or perceptual retuning. With retuning or recalibration, listeners can use these contextual cues to temporarily or permanently reconfigure internal representations of phoneme categories to adjust to and understand novel interlocutors more easily. These two types of perceptual learning, previously investigated in large part separately, are highly similar in allowing listeners to use speech-external information to make phoneme boundary adjustments. This study explored whether the two sources may work in conjunction to induce adaptation, thus emulating real life, in which listeners are indeed likely to encounter both types of cue together. Listeners who received combined audiovisual and lexical cues showed perceptual learning effects similar to listeners who only received audiovisual cues, while listeners who received only lexical cues showed weaker effects compared with the two other groups. The combination of cues did not lead to additive retuning or recalibration effects, suggesting that lexical and audiovisual cues operate differently with regard to how listeners use them for reshaping perceptual categories. Reaction times did not significantly differ across the three conditions, so none of the forms of adjustment were either aided or hindered by processing time differences. Mechanisms underlying these forms of perceptual learning may diverge in numerous ways despite similarities in experimental applications.

    Additional information

    Data and materials
  • Ullas, S., Formisano, E., Eisner, F., & Cutler, A. (2020). Interleaved lexical and audiovisual information can retune phoneme boundaries. Attention, Perception & Psychophysics, 82, 2018-2026. doi:10.3758/s13414-019-01961-8.

    Abstract

    To adapt to situations in which speech perception is difficult, listeners can adjust boundaries between phoneme categories using perceptual learning. Such adjustments can draw on lexical information in surrounding speech, or on visual cues via speech-reading. In the present study, listeners proved they were able to flexibly adjust the boundary between two plosive/stop consonants, /p/-/t/, using both lexical and speech-reading information and given the same experimental design for both cue types. Videos of a speaker pronouncing pseudo-words and audio recordings of Dutch words were presented in alternating blocks of either stimulus type. Listeners were able to switch between cues to adjust phoneme boundaries, and resulting effects were comparable to results from listeners receiving only a single source of information. Overall, audiovisual cues (i.e., the videos) produced the stronger effects, commensurate with their applicability for adapting to noisy environments. Lexical cues were able to induce effects with fewer exposure stimuli and a changing phoneme bias, in a design unlike most prior studies of lexical retuning. While lexical retuning effects were relatively weaker compared to audiovisual recalibration, this discrepancy could reflect how lexical retuning may be more suitable for adapting to speakers than to environments. Nonetheless, the presence of the lexical retuning effects suggests that it may be invoked at a faster rate than previously seen. In general, this technique has further illuminated the robustness of adaptability in speech perception, and offers the potential to enable further comparisons across differing forms of perceptual learning.
  • Ullas, S., Hausfeld, L., Cutler, A., Eisner, F., & Formisano, E. (2020). Neural correlates of phonetic adaptation as induced by lexical and audiovisual context. Journal of Cognitive Neuroscience, 32(11), 2145-2158. doi:10.1162/jocn_a_01608.

    Abstract

    When speech perception is difficult, one way listeners adjust is by reconfiguring phoneme category boundaries, drawing on contextual information. Both lexical knowledge and lipreading cues are used in this way, but it remains unknown whether these two differing forms of perceptual learning are similar at a neural level. This study compared phoneme boundary adjustments driven by lexical or audiovisual cues, using ultra-high-field 7-T fMRI. During imaging, participants heard exposure stimuli and test stimuli. Exposure stimuli for lexical retuning were audio recordings of words, and those for audiovisual recalibration were audio–video recordings of lip movements during utterances of pseudowords. Test stimuli were ambiguous phonetic strings presented without context, and listeners reported what phoneme they heard. Reports reflected phoneme biases in preceding exposure blocks (e.g., more reported /p/ after /p/-biased exposure). Analysis of corresponding brain responses indicated that both forms of cue use were associated with a network of activity across the temporal cortex, plus parietal, insula, and motor areas. Audiovisual recalibration also elicited significant occipital cortex activity despite the lack of visual stimuli. Activity levels in several ROIs also covaried with strength of audiovisual recalibration, with greater activity accompanying larger recalibration shifts. Similar activation patterns appeared for lexical retuning, but here, no significant ROIs were identified. Audiovisual and lexical forms of perceptual learning thus induce largely similar brain response patterns. However, audiovisual recalibration involves additional visual cortex contributions, suggesting that previously acquired visual information (on lip movements) is retrieved and deployed to disambiguate auditory perception.
  • Cutler, A. (1989). Auditory lexical access: Where do we start? In W. Marslen-Wilson (Ed.), Lexical representation and process (pp. 342-356). Cambridge, MA: MIT Press.

    Abstract

    The lexicon, considered as a component of the process of recognizing speech, is a device that accepts a sound image as input and outputs meaning. Lexical access is the process of formulating an appropriate input and mapping it onto an entry in the lexicon's store of sound images matched with their meanings. This chapter addresses the problems of auditory lexical access from continuous speech. The central argument to be proposed is that utterance prosody plays a crucial role in the access process. Continuous listening faces problems that are not present in visual recognition (reading) or in noncontinuous recognition (understanding isolated words). Aspects of utterance prosody offer a solution to these particular problems.
  • Cutler, A., & Butterfield, S. (1989). Natural speech cues to word segmentation under difficult listening conditions. In J. Tubach, & J. Mariani (Eds.), Proceedings of Eurospeech 89: European Conference on Speech Communication and Technology: Vol. 2 (pp. 372-375). Edinburgh: CEP Consultants.

    Abstract

    One of a listener's major tasks in understanding continuous speech is segmenting the speech signal into separate words. When listening conditions are difficult, speakers can help listeners by deliberately speaking more clearly. In three experiments, we examined how word boundaries are produced in deliberately clear speech. We found that speakers do indeed attempt to mark word boundaries; moreover, they differentiate between word boundaries in a way which suggests they are sensitive to listener needs. Application of heuristic segmentation strategies makes word boundaries before strong syllables easiest for listeners to perceive; but under difficult listening conditions speakers pay more attention to marking word boundaries before weak syllables, i.e. they mark those boundaries which are otherwise particularly hard to perceive.
  • Cutler, A., Howard, D., & Patterson, K. E. (1989). Misplaced stress on prosody: A reply to Black and Byng. Cognitive Neuropsychology, 6, 67-83.

    Abstract

    The recent claim by Black and Byng (1986) that lexical access in reading is subject to prosodic constraints is examined and found to be unsupported. The evidence from impaired reading which Black and Byng report is based on poorly controlled stimulus materials and is inadequately analysed and reported. An alternative explanation of their findings is proposed, and new data are reported for which this alternative explanation can account but their model cannot. Finally, their proposal is shown to be theoretically unmotivated and in conflict with evidence from normal reading.
  • Cutler, A. (1989). Straw modules [Commentary/Massaro: Speech perception]. Behavioral and Brain Sciences, 12, 760-762.
  • Cutler, A. (1989). The new Victorians. New Scientist, (1663), 66.
  • Patterson, R. D., & Cutler, A. (1989). Auditory preprocessing and recognition of speech. In A. Baddeley, & N. Bernsen (Eds.), Research directions in cognitive science: A european perspective: Vol. 1. Cognitive psychology (pp. 23-60). London: Erlbaum.
  • Smith, M. R., Cutler, A., Butterfield, S., & Nimmo-Smith, I. (1989). The perception of rhythm and word boundaries in noise-masked speech. Journal of Speech and Hearing Research, 32, 912-920.

    Abstract

    The present experiment tested the suggestion that human listeners may exploit durational information in speech to parse continuous utterances into words. Listeners were presented with six-syllable unpredictable utterances under noise-masking, and were required to judge between alternative word strings as to which best matched the rhythm of the masked utterances. For each utterance there were four alternative strings: (a) an exact rhythmic and word boundary match, (b) a rhythmic mismatch, and (c) two utterances with the same rhythm as the masked utterance, but different word boundary locations. Listeners were clearly able to perceive the rhythm of the masked utterances: The rhythmic mismatch was chosen significantly less often than any other alternative. Within the three rhythmically matched alternatives, the exact match was chosen significantly more often than either word boundary mismatch. Thus, listeners both perceived speech rhythm and used durational cues effectively to locate the position of word boundaries.
  • Cutler, A. (1979). Beyond parsing and lexical look-up. In R. J. Wales, & E. C. T. Walker (Eds.), New approaches to language mechanisms: a collection of psycholinguistic studies (pp. 133-149). Amsterdam: North-Holland.
  • Cutler, A. (1979). Contemporary reaction to Rudolf Meringer’s speech error research. Historiograpia Linguistica, 6, 57-76.
  • Cutler, A., & Norris, D. (1979). Monitoring sentence comprehension. In W. E. Cooper, & E. C. T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp. 113-134). Hillsdale: Erlbaum.
  • Swinney, D. A., & Cutler, A. (1979). The access and processing of idiomatic expressions. Journal of Verbal Learning an Verbal Behavior, 18, 523-534. doi:10.1016/S0022-5371(79)90284-6.

    Abstract

    Two experiments examined the nature of access, storage, and comprehension of idiomatic phrases. In both studies a Phrase Classification Task was utilized. In this, reaction times to determine whether or not word strings constituted acceptable English phrases were measured. Classification times were significantly faster to idiom than to matched control phrases. This effect held under conditions involving different categories of idioms, different transitional probabilities among words in the phrases, and different levels of awareness of the presence of idioms in the materials. The data support a Lexical Representation Hypothesis for the processing of idioms.

Share this page