Anne Cutler

Publications

  • Bruggeman, L., & Cutler, A. (2020). No L1 privilege in talker adaptation. Bilingualism: Language and Cognition, 23(3), 681-693. doi:10.1017/S1366728919000646.

    Abstract

    As a rule, listening is easier in first (L1) than second languages (L2); difficult L2 listening can challenge even highly proficient users. We here examine one particular listening function, adaptation to novel talkers, in such a high-proficiency population: Dutch emigrants to Australia, predominantly using English outside the family, but all also retaining L1 proficiency. Using lexically-guided perceptual learning (Norris, McQueen & Cutler, 2003), we investigated these listeners’ adaptation to an ambiguous speech sound, in parallel experiments in both their L1 and their L2. A control study established that perceptual learning outcomes were unaffected by the procedural measures required for this double comparison. The emigrants showed equivalent proficiency in tests in both languages, robust perceptual adaptation in their L2, English, but no adaptation in L1. We propose that adaptation to novel talkers is a language-specific skill requiring regular novel practice; a limited set of known (family) interlocutors cannot meet this requirement.
  • Ip, M. H. K., & Cutler, A. (2020). Universals of listening: Equivalent prosodic entrainment in tone and non-tone languages. Cognition, 202: 104311. doi:10.1016/j.cognition.2020.104311.

    Abstract

    In English and Dutch, listeners entrain to prosodic contours to predict where focus will fall in an utterance. Here, we ask whether this strategy is universally available, even in languages with very different phonological systems (e.g., tone versus non-tone languages). In a phoneme detection experiment, we examined whether prosodic entrainment also occurs in Mandarin Chinese, a tone language, where the use of various suprasegmental cues to lexical identity may take precedence over their use in salience. Consistent with the results from Germanic languages, response times were facilitated when preceding intonation predicted high stress on the target-bearing word, and the lexical tone of the target word (i.e., rising versus falling) did not affect the Mandarin listeners' response. Further, the extent to which prosodic entrainment was used to detect the target phoneme was the same in both English and Mandarin listeners. Nevertheless, native Mandarin speakers did not adopt an entrainment strategy when the sentences were presented in English, consistent with the suggestion that L2 listening may be strained by additional functional load from prosodic processing. These findings have implications for how universal and language-specific mechanisms interact in the perception of focus structure in everyday discourse.

    Additional information

    supplementary data
  • Mandal, S., Best, C. T., Shaw, J., & Cutler, A. (2020). Bilingual phonology in dichotic perception: A case study of Malayalam and English voicing. Glossa: A Journal of General Linguistics, 5(1): 73. doi:10.5334/gjgl.853.

    Abstract

    Listeners often experience cocktail-party situations, encountering multiple ongoing conversations while tracking just one. Capturing the words spoken under such conditions requires selective attention and processing, which involves using phonetic details to discern phonological structure. How do bilinguals accomplish this in L1-L2 competition? We addressed that question using a dichotic listening task with fluent Malayalam-English bilinguals, in which they were presented with synchronized nonce words, one in each language in separate ears, with competing onsets of a labial stop (Malayalam) and a labial fricative (English), both voiced or both voiceless. They were required to attend to the Malayalam or the English item, in separate blocks, and report the initial consonant they heard. We found that perceptual intrusions from the unattended to the attended language were influenced by voicing, with more intrusions on voiced than voiceless trials. This result supports our proposal for the feature specification of consonants in Malayalam-English bilinguals, which makes use of privative features, underspecification and the “standard approach” to laryngeal features, as against “laryngeal realism”. Given this representational account, we observe that intrusions result from phonetic properties in the unattended signal being assimilated to the closest matching phonological category in the attended language, and are more likely for segments with a greater number of phonological feature specifications.
  • Ullas, S., Formisano, E., Eisner, F., & Cutler, A. (2020). Audiovisual and lexical cues do not additively enhance perceptual adaptation. Psychonomic Bulletin & Review, 27, 707-715. doi:10.3758/s13423-020-01728-5.

    Abstract

    When listeners experience difficulty in understanding a speaker, lexical and audiovisual (or lipreading) information can be a helpful source of guidance. These two types of information embedded in speech can also guide perceptual adjustment, also known as recalibration or perceptual retuning. With retuning or recalibration, listeners can use these contextual cues to temporarily or permanently reconfigure internal representations of phoneme categories to adjust to and understand novel interlocutors more easily. These two types of perceptual learning, previously investigated in large part separately, are highly similar in allowing listeners to use speech-external information to make phoneme boundary adjustments. This study explored whether the two sources may work in conjunction to induce adaptation, thus emulating real life, in which listeners are indeed likely to encounter both types of cue together. Listeners who received combined audiovisual and lexical cues showed perceptual learning effects similar to listeners who only received audiovisual cues, while listeners who received only lexical cues showed weaker effects compared with the two other groups. The combination of cues did not lead to additive retuning or recalibration effects, suggesting that lexical and audiovisual cues operate differently with regard to how listeners use them for reshaping perceptual categories. Reaction times did not significantly differ across the three conditions, so none of the forms of adjustment were either aided or hindered by processing time differences. Mechanisms underlying these forms of perceptual learning may diverge in numerous ways despite similarities in experimental applications.

    Additional information

    Data and materials
  • Ullas, S., Formisano, E., Eisner, F., & Cutler, A. (2020). Interleaved lexical and audiovisual information can retune phoneme boundaries. Attention, Perception & Psychophysics, 82, 2018-2026. doi:10.3758/s13414-019-01961-8.

    Abstract

    To adapt to situations in which speech perception is difficult, listeners can adjust boundaries between phoneme categories using perceptual learning. Such adjustments can draw on lexical information in surrounding speech, or on visual cues via speech-reading. In the present study, listeners proved they were able to flexibly adjust the boundary between two plosive/stop consonants, /p/-/t/, using both lexical and speech-reading information and given the same experimental design for both cue types. Videos of a speaker pronouncing pseudo-words and audio recordings of Dutch words were presented in alternating blocks of either stimulus type. Listeners were able to switch between cues to adjust phoneme boundaries, and resulting effects were comparable to results from listeners receiving only a single source of information. Overall, audiovisual cues (i.e., the videos) produced the stronger effects, commensurate with their applicability for adapting to noisy environments. Lexical cues were able to induce effects with fewer exposure stimuli and a changing phoneme bias, in a design unlike most prior studies of lexical retuning. While lexical retuning effects were relatively weaker compared to audiovisual recalibration, this discrepancy could reflect how lexical retuning may be more suitable for adapting to speakers than to environments. Nonetheless, the presence of the lexical retuning effects suggests that it may be invoked at a faster rate than previously seen. In general, this technique has further illuminated the robustness of adaptability in speech perception, and offers the potential to enable further comparisons across differing forms of perceptual learning.
  • Ullas, S., Hausfeld, L., Cutler, A., Eisner, F., & Formisano, E. (2020). Neural correlates of phonetic adaptation as induced by lexical and audiovisual context. Journal of Cognitive Neuroscience, 32(11), 2145-2158. doi:10.1162/jocn_a_01608.

    Abstract

    When speech perception is difficult, one way listeners adjust is by reconfiguring phoneme category boundaries, drawing on contextual information. Both lexical knowledge and lipreading cues are used in this way, but it remains unknown whether these two differing forms of perceptual learning are similar at a neural level. This study compared phoneme boundary adjustments driven by lexical or audiovisual cues, using ultra-high-field 7-T fMRI. During imaging, participants heard exposure stimuli and test stimuli. Exposure stimuli for lexical retuning were audio recordings of words, and those for audiovisual recalibration were audio–video recordings of lip movements during utterances of pseudowords. Test stimuli were ambiguous phonetic strings presented without context, and listeners reported what phoneme they heard. Reports reflected phoneme biases in preceding exposure blocks (e.g., more reported /p/ after /p/-biased exposure). Analysis of corresponding brain responses indicated that both forms of cue use were associated with a network of activity across the temporal cortex, plus parietal, insula, and motor areas. Audiovisual recalibration also elicited significant occipital cortex activity despite the lack of visual stimuli. Activity levels in several ROIs also covaried with strength of audiovisual recalibration, with greater activity accompanying larger recalibration shifts. Similar activation patterns appeared for lexical retuning, but here, no significant ROIs were identified. Audiovisual and lexical forms of perceptual learning thus induce largely similar brain response patterns. However, audiovisual recalibration involves additional visual cortex contributions, suggesting that previously acquired visual information (on lip movements) is retrieved and deployed to disambiguate auditory perception.
  • Cutler, A., & Norris, D. (2016). Bottoms up! How top-down pitfalls ensnare speech perception researchers too. Commentary on C. Firestone & B. Scholl: Cognition does not affect perception: Evaluating the evidence for 'top-down' effects. Behavioral and Brain Sciences, e236. doi:10.1017/S0140525X15002745.

    Abstract

    Not only can the pitfalls that Firestone & Scholl (F&S) identify be generalised across multiple studies within the field of visual perception, but also they have general application outside the field wherever perceptual and cognitive processing are compared. We call attention to the widespread susceptibility of research on the perception of speech to versions of the same pitfalls.
  • Norris, D., McQueen, J. M., & Cutler, A. (2016). Prediction, Bayesian inference and feedback in speech recognition. Language, Cognition and Neuroscience, 31(1), 4-18. doi:10.1080/23273798.2015.1081703.

    Abstract

    Speech perception involves prediction, but how is that prediction implemented? In cognitive models prediction has often been taken to imply that there is feedback of activation from lexical to pre-lexical processes as implemented in interactive-activation models (IAMs). We show that simple activation feedback does not actually improve speech recognition. However, other forms of feedback can be beneficial. In particular, feedback can enable the listener to adapt to changing input, and can potentially help the listener to recognise unusual input, or recognise speech in the presence of competing sounds. The common feature of these helpful forms of feedback is that they are all ways of optimising the performance of speech recognition using Bayesian inference. That is, listeners make predictions about speech because speech recognition is optimal in the sense captured in Bayesian models.
  • Cutler, A. (2010). Abstraction-based efficiency in the lexicon. Laboratory Phonology, 1(2), 301-318. doi:10.1515/LABPHON.2010.016.

    Abstract

    Listeners learn from their past experience of listening to spoken words, and use this learning to maximise the efficiency of future word recognition. This paper summarises evidence that the facilitatory effects of drawing on past experience are mediated by abstraction, enabling learning to be generalised across new words and new listening situations. Phoneme category retuning, which allows adaptation to speaker-specific articulatory characteristics, is generalised on the basis of relatively brief experience to words previously unheard from that speaker. Abstract knowledge of prosodic regularities is applied to recognition even of novel words for which these regularities were violated. Prosodic word-boundary regularities drive segmentation of speech into words independently of the membership of the lexical candidate set resulting from the segmentation operation. Each of these different cases illustrates how abstraction from past listening experience has contributed to the efficiency of lexical recognition.
  • Cutler, A., Cooke, M., & Lecumberri, M. L. G. (2010). Preface. Speech Communication, 52, 863. doi:10.1016/j.specom.2010.11.003.

    Abstract

    Adverse listening conditions always make the perception of speech harder, but their deleterious effect is far greater if the speech we are trying to understand is in a non-native language. An imperfect signal can be coped with by recourse to the extensive knowledge one has of a native language, and imperfect knowledge of a non-native language can still support useful communication when speech signals are high-quality. But the combination of imperfect signal and imperfect knowledge leads rapidly to communication breakdown. This phenomenon is undoubtedly well known to every reader of Speech Communication from personal experience. Many readers will also have a professional interest in explaining, or remedying, the problems it produces. The journal’s readership being a decidedly interdisciplinary one, this interest will involve quite varied scientific approaches, including (but not limited to) modelling the interaction of first and second language vocabularies and phonemic repertoires, developing targeted listening training for language learners, and redesigning the acoustics of classrooms and conference halls. In other words, the phenomenon that this special issue deals with is a well-known one, that raises important scientific and practical questions across a range of speech communication disciplines, and Speech Communication is arguably the ideal vehicle for presentation of such a breadth of approaches in a single volume. The call for papers for this issue elicited a large number of submissions from across the full range of the journal’s interdisciplinary scope, requiring the guest editors to apply very strict criteria to the final selection. 
Perhaps unique in the history of treatments of this topic is the combination represented by the guest editors for this issue: a phonetician whose primary research interest is in second-language speech (MLGL), an engineer whose primary research field is the acoustics of masking in speech processing (MC), and a psychologist whose primary research topic is the recognition of spoken words (AC). In the opening article of the issue, these three authors together review the existing literature on listening to second-language speech under adverse conditions, bringing together these differing perspectives for the first time in a single contribution. The introductory review is followed by 13 new experimental reports of phonetic, acoustic and psychological studies of the topic. The guest editors thank Speech Communication editor Marc Swerts and the journal’s team at Elsevier, as well as all the reviewers who devoted time and expert efforts to perfecting the contributions to this issue.
  • Cutler, A., Treiman, R., & Van Ooijen, B. (2010). Strategic deployment of orthographic knowledge in phoneme detection. Language and Speech, 53(3), 307-320. doi:10.1177/0023830910371445.

    Abstract

    The phoneme detection task is widely used in spoken-word recognition research. Alphabetically literate participants, however, are more used to explicit representations of letters than of phonemes. The present study explored whether phoneme detection is sensitive to how target phonemes are, or may be, orthographically realized. Listeners detected the target sounds [b, m, t, f, s, k] in word-initial position in sequences of isolated English words. Response times were faster to the targets [b, m, t], which have consistent word-initial spelling, than to the targets [f, s, k], which are inconsistently spelled, but only when spelling was rendered salient by the presence in the experiment of many irregularly spelled filler words. Within the inconsistent targets [f, s, k], there was no significant difference between responses to targets in words with more usual (foam, seed, cattle) versus less usual (phone, cede, kettle) spellings. Phoneme detection is thus not necessarily sensitive to orthographic effects; knowledge of spelling stored in the lexical representations of words does not automatically become available as word candidates are activated. However, salient orthographic manipulations in experimental input can induce such sensitivity. We attribute this to listeners' experience of the value of spelling in everyday situations that encourage phonemic decisions (such as learning new names).
  • Lecumberri, M. L. G., Cooke, M., & Cutler, A. (2010). Non-native speech perception in adverse conditions: A review. Speech Communication, 52, 864-886. doi:10.1016/j.specom.2010.08.014.

    Abstract

    If listening in adverse conditions is hard, then listening in a foreign language is doubly so: non-native listeners have to cope with both imperfect signals and imperfect knowledge. Comparison of native and non-native listener performance in speech-in-noise tasks helps to clarify the role of prior linguistic experience in speech perception, and, more directly, contributes to an understanding of the problems faced by language learners in everyday listening situations. This article reviews experimental studies on non-native listening in adverse conditions, organised around three principal contributory factors: the task facing listeners, the effect of adverse conditions on speech, and the differences among listener populations. Based on a comprehensive tabulation of key studies, we identify robust findings, research trends and gaps in current knowledge.
  • Cutler, A., Mehler, J., Norris, D., & Segui, J. (1983). A language-specific comprehension strategy [Letters to Nature]. Nature, 304, 159-160. doi:10.1038/304159a0.

    Abstract

    Infants acquire whatever language is spoken in the environment into which they are born. The mental capability of the newborn child is not biased in any way towards the acquisition of one human language rather than another. Because psychologists who attempt to model the process of language comprehension are interested in the structure of the human mind, rather than in the properties of individual languages, strategies which they incorporate in their models are presumed to be universal, not language-specific. In other words, strategies of comprehension are presumed to be characteristic of the human language processing system, rather than, say, the French, English, or Igbo language processing systems. We report here, however, on a comprehension strategy which appears to be used by native speakers of French but not by native speakers of English.
  • Levelt, W. J. M., & Cutler, A. (1983). Prosodic marking in speech repair. Journal of Semantics, 2, 205-217. doi:10.1093/semant/2.2.205.

    Abstract

    Spontaneous self-corrections in speech pose a communication problem; the speaker must make clear to the listener not only that the original utterance was faulty, but where it was faulty and how the fault is to be corrected. Prosodic marking of corrections - making the prosody of the repair noticeably different from that of the original utterance - offers a resource which the speaker can exploit to provide the listener with such information. A corpus of more than 400 spontaneous speech repairs was analysed, and the prosodic characteristics compared with the syntactic and semantic characteristics of each repair. Prosodic marking showed no relationship at all with the syntactic characteristics of repairs. Instead, marking was associated with certain semantic factors: repairs were marked when the original utterance had been actually erroneous, rather than simply less appropriate than the repair; and repairs tended to be marked more often when the set of items encompassing the error and the repair was small rather than when it was large. These findings lend further weight to the characterization of accent as essentially semantic in function.
