Anne Cutler

Publications

Displaying 1 - 8 of 8
  • Asano, Y., Yuan, C., Grohe, A.-K., Weber, A., Antoniou, M., & Cutler, A. (2020). Uptalk interpretation as a function of listening experience. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 735-739). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-150.

    Abstract

    The term “uptalk” describes utterance-final pitch rises that carry no sentence-structural information. Uptalk is usually dialectal or sociolectal, and Australian English (AusEng) is particularly known for this attribute. We ask here whether experience with an uptalk variety affects listeners’ ability to categorise rising pitch contours on the basis of the timing and height of their onset and offset. Listeners were two groups of English-speakers (AusEng, and American English), and three groups of listeners with L2 English: one group with Mandarin as L1 and experience of listening to AusEng, one with German as L1 and experience of listening to AusEng, and one with German as L1 but no AusEng experience. They heard nouns (e.g. flower, piano) in the framework “Got a NOUN”, each ending with a pitch rise artificially manipulated on three contrasts: low vs. high rise onset, low vs. high rise offset and early vs. late rise onset. Their task was to categorise the tokens as “question” or “statement”, and we analysed the effect of the pitch contrasts on their judgements. Only the native AusEng listeners were able to use the pitch contrasts systematically in making these categorisations.
  • Yu, J., Mailhammer, R., & Cutler, A. (2020). Vocabulary structure affects word recognition: Evidence from German listeners. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 474-478). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-97.

    Abstract

    Lexical stress is realised similarly in English, German, and Dutch. On a suprasegmental level, stressed syllables tend to be longer and more acoustically salient than unstressed syllables; segmentally, vowels in unstressed syllables are often reduced. The frequency of unreduced unstressed syllables (where only the suprasegmental cues indicate lack of stress) however, differs across the languages. The present studies test whether listener behaviour is affected by these vocabulary differences, by investigating German listeners’ use of suprasegmental cues to lexical stress in German and English word recognition. In a forced-choice identification task, German listeners correctly assigned single-syllable fragments (e.g., Kon-) to one of two words differing in stress (KONto, konZEPT). Thus, German listeners can exploit suprasegmental information for identifying words. German listeners also performed above chance in a similar task in English (with, e.g., DIver, diVERT), i.e., their sensitivity to these cues also transferred to a nonnative language. An English listener group, in contrast, failed in the English fragment task. These findings mirror vocabulary patterns: German has more words with unreduced unstressed syllables than English does.
  • Burchfield, L. A., Luk, S.-.-H.-K., Antoniou, M., & Cutler, A. (2017). Lexically guided perceptual learning in Mandarin Chinese. In Proceedings of Interspeech 2017 (pp. 576-580). doi:10.21437/Interspeech.2017-618.

    Abstract

    Lexically guided perceptual learni ng refers to the use of lexical knowledge to retune sp eech categories and thereby adapt to a novel talker’s pronunciation. This adaptation has been extensively documented, but primarily for segmental-based learning in English and Dutch. In languages with lexical tone, such as Mandarin Chinese, tonal categories can also be retuned in this way, but segmental category retuning had not been studied. We report two experiment s in which Mandarin Chinese listeners were exposed to an ambiguous mixture of [f] and [s] in lexical contexts favoring an interpretation as either [f] or [s]. Listeners were subsequently more likely to identify sounds along a continuum between [f] and [s], and to interpret minimal word pairs, in a manner consistent with this exposure. Thus lexically guided perceptual learning of segmental categories had indeed taken place, consistent with suggestions that such learning may be a universally available adaptation process
  • Cutler, A. (2017). Converging evidence for abstract phonological knowledge in speech processing. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 1447-1448). Austin, TX: Cognitive Science Society.

    Abstract

    The perceptual processing of speech is a constant interplay of multiple competing albeit convergent processes: acoustic input vs. higher-level representations, universal mechanisms vs. language-specific, veridical traces of speech experience vs. construction and activation of abstract representations. The present summary concerns the third of these issues. The ability to generalise across experience and to deal with resulting abstractions is the hallmark of human cognition, visible even in early infancy. In speech processing, abstract representations play a necessary role in both production and perception. New sorts of evidence are now informing our understanding of the breadth of this role.
  • Ip, M. H. K., & Cutler, A. (2017). Intonation facilitates prediction of focus even in the presence of lexical tones. In Proceedings of Interspeech 2017 (pp. 1218-1222). doi:10.21437/Interspeech.2017-264.

    Abstract

    In English and Dutch, listeners entrain to prosodic contours to predict where focus will fall in an utterance. However, is this strategy universally available, even in languages with different phonological systems? In a phoneme detection experiment, we examined whether prosodic entrainment is also found in Mandarin Chinese, a tone language, where in principle the use of pitch for lexical identity may take precedence over the use of pitch cues to salience. Consistent with the results from Germanic languages, response times were facilitated when preceding intonation predicted accent on the target-bearing word. Acoustic analyses revealed greater F0 range in the preceding intonation of the predicted-accent sentences. These findings have implications for how universal and language-specific mechanisms interact in the processing of salience.
  • Kember, H., Grohe, A.-.-K., Zahner, K., Braun, B., Weber, A., & Cutler, A. (2017). Similar prosodic structure perceived differently in German and English. In Proceedings of Interspeech 2017 (pp. 1388-1392).

    Abstract

    English and German have similar prosody, but their speakers realize some pitch falls (not rises) in subtly different ways. We here test for asymmetry in perception. An ABX discrimination task requiring F0 slope or duration judgements on isolated vowels revealed no cross-language difference in duration or F0 fall discrimination, but discrimination of rises (realized similarly in each language) was less accurate for English than for German listeners. This unexpected finding may reflect greater sensitivity to rising patterns by German listeners, or reduced sensitivity by English listeners as a result of extensive exposure to phrase-final rises (“uptalk”) in their language
  • Cutler, A. (1987). Components of prosodic effects in speech recognition. In Proceedings of the Eleventh International Congress of Phonetic Sciences: Vol. 1 (pp. 84-87). Tallinn: Academy of Sciences of the Estonian SSR, Institute of Language and Literature.

    Abstract

    Previous research has shown that listeners use the prosodic structure of utterances in a predictive fashion in sentence comprehension, to direct attention to accented words. Acoustically identical words spliced into sentence contexts arc responded to differently if the prosodic structure of the context is \ aricd: when the preceding prosody indicates that the word will he accented, responses are faster than when the preceding prosodv is inconsistent with accent occurring on that word. In the present series of experiments speech hybridisation techniques were first used to interchange the timing patterns within pairs of prosodic variants of utterances, independently of the pitch and intensity contours. The time-adjusted utterances could then serve as a basis lor the orthogonal manipulation of the three prosodic dimensions of pilch, intensity and rhythm. The overall pattern of results showed that when listeners use prosody to predict accent location, they do not simply rely on a single prosodic dimension, hut exploit the interaction between pitch, intensity and rhythm.
  • Cutler, A., & Carter, D. (1987). The prosodic structure of initial syllables in English. In J. Laver, & M. Jack (Eds.), Proceedings of the European Conference on Speech Technology: Vol. 1 (pp. 207-210). Edinburgh: IEE.

Share this page