Anne Cutler

Publications

Displaying 1 - 16 of 16
  • Bruggeman, L., & Cutler, A. (2020). No L1 privilege in talker adaptation. Bilingualism: Language and Cognition, 23(3), 681-693. doi:10.1017/S1366728919000646.

    Abstract

    As a rule, listening is easier in first (L1) than second languages (L2); difficult L2 listening can challenge even highly proficient users. We here examine one particular listening function, adaptation to novel talkers, in such a high-proficiency population: Dutch emigrants to Australia, predominantly using English outside the family, but all also retaining L1 proficiency. Using lexically-guided perceptual learning (Norris, McQueen & Cutler, 2003), we investigated these listeners’ adaptation to an ambiguous speech sound, in parallel experiments in both their L1 and their L2. A control study established that perceptual learning outcomes were unaffected by the procedural measures required for this double comparison. The emigrants showed equivalent proficiency in tests in both languages, robust perceptual adaptation in their L2, English, but no adaptation in L1. We propose that adaptation to novel talkers is a language-specific skill requiring regular novel practice; a limited set of known (family) interlocutors cannot meet this requirement.
  • Ip, M. H. K., & Cutler, A. (2020). Universals of listening: Equivalent prosodic entrainment in tone and non-tone languages. Cognition, 202: 104311. doi:10.1016/j.cognition.2020.104311.

    Abstract

    In English and Dutch, listeners entrain to prosodic contours to predict where focus will fall in an utterance. Here, we ask whether this strategy is universally available, even in languages with very different phonological systems (e.g., tone versus non-tone languages). In a phoneme detection experiment, we examined whether prosodic entrainment also occurs in Mandarin Chinese, a tone language, where the use of various suprasegmental cues to lexical identity may take precedence over their use in salience. Consistent with the results from Germanic languages, response times were facilitated when preceding intonation predicted high stress on the target-bearing word, and the lexical tone of the target word (i.e., rising versus falling) did not affect the Mandarin listeners' response. Further, the extent to which prosodic entrainment was used to detect the target phoneme was the same in both English and Mandarin listeners. Nevertheless, native Mandarin speakers did not adopt an entrainment strategy when the sentences were presented in English, consistent with the suggestion that L2 listening may be strained by additional functional load from prosodic processing. These findings have implications for how universal and language-specific mechanisms interact in the perception of focus structure in everyday discourse.

    Additional information

    supplementary data
  • Mandal, S., Best, C. T., Shaw, J., & Cutler, A. (2020). Bilingual phonology in dichotic perception: A case study of Malayalam and English voicing. Glossa: A Journal of General Linguistics, 5(1): 73. doi:10.5334/gjgl.853.

    Abstract

    Listeners often experience cocktail-party situations, encountering multiple ongoing conversa- tions while tracking just one. Capturing the words spoken under such conditions requires selec- tive attention and processing, which involves using phonetic details to discern phonological structure. How do bilinguals accomplish this in L1-L2 competition? We addressed that question using a dichotic listening task with fluent Malayalam-English bilinguals, in which they were pre- sented with synchronized nonce words, one in each language in separate ears, with competing onsets of a labial stop (Malayalam) and a labial fricative (English), both voiced or both voiceless. They were required to attend to the Malayalam or the English item, in separate blocks, and report the initial consonant they heard. We found that perceptual intrusions from the unattended to the attended language were influenced by voicing, with more intrusions on voiced than voiceless tri- als. This result supports our proposal for the feature specification of consonants in Malayalam- English bilinguals, which makes use of privative features, underspecification and the “standard approach” to laryngeal features, as against “laryngeal realism”. Given this representational account, we observe that intrusions result from phonetic properties in the unattended signal being assimilated to the closest matching phonological category in the attended language, and are more likely for segments with a greater number of phonological feature specifications.
  • Ullas, S., Formisano, E., Eisner, F., & Cutler, A. (2020). Audiovisual and lexical cues do not additively enhance perceptual adaptation. Psychonomic Bulletin & Review, 27, 707-715. doi:10.3758/s13423-020-01728-5.

    Abstract

    When listeners experience difficulty in understanding a speaker, lexical and audiovisual (or lipreading) information can be a helpful source of guidance. These two types of information embedded in speech can also guide perceptual adjustment, also known as recalibration or perceptual retuning. With retuning or recalibration, listeners can use these contextual cues to temporarily or permanently reconfigure internal representations of phoneme categories to adjust to and understand novel interlocutors more easily. These two types of perceptual learning, previously investigated in large part separately, are highly similar in allowing listeners to use speech-external information to make phoneme boundary adjustments. This study explored whether the two sources may work in conjunction to induce adaptation, thus emulating real life, in which listeners are indeed likely to encounter both types of cue together. Listeners who received combined audiovisual and lexical cues showed perceptual learning effects similar to listeners who only received audiovisual cues, while listeners who received only lexical cues showed weaker effects compared with the two other groups. The combination of cues did not lead to additive retuning or recalibration effects, suggesting that lexical and audiovisual cues operate differently with regard to how listeners use them for reshaping perceptual categories. Reaction times did not significantly differ across the three conditions, so none of the forms of adjustment were either aided or hindered by processing time differences. Mechanisms underlying these forms of perceptual learning may diverge in numerous ways despite similarities in experimental applications.

    Additional information

    Data and materials
  • Ullas, S., Formisano, E., Eisner, F., & Cutler, A. (2020). Interleaved lexical and audiovisual information can retune phoneme boundaries. Attention, Perception & Psychophysics, 82, 2018-2026. doi:10.3758/s13414-019-01961-8.

    Abstract

    To adapt to situations in which speech perception is difficult, listeners can adjust boundaries between phoneme categories using perceptual learning. Such adjustments can draw on lexical information in surrounding speech, or on visual cues via speech-reading. In the present study, listeners proved they were able to flexibly adjust the boundary between two plosive/stop consonants, /p/-/t/, using both lexical and speech-reading information and given the same experimental design for both cue types. Videos of a speaker pronouncing pseudo-words and audio recordings of Dutch words were presented in alternating blocks of either stimulus type. Listeners were able to switch between cues to adjust phoneme boundaries, and resulting effects were comparable to results from listeners receiving only a single source of information. Overall, audiovisual cues (i.e., the videos) produced the stronger effects, commensurate with their applicability for adapting to noisy environments. Lexical cues were able to induce effects with fewer exposure stimuli and a changing phoneme bias, in a design unlike most prior studies of lexical retuning. While lexical retuning effects were relatively weaker compared to audiovisual recalibration, this discrepancy could reflect how lexical retuning may be more suitable for adapting to speakers than to environments. Nonetheless, the presence of the lexical retuning effects suggests that it may be invoked at a faster rate than previously seen. In general, this technique has further illuminated the robustness of adaptability in speech perception, and offers the potential to enable further comparisons across differing forms of perceptual learning.
  • Ullas, S., Hausfeld, L., Cutler, A., Eisner, F., & Formisano, E. (2020). Neural correlates of phonetic adaptation as induced by lexical and audiovisual context. Journal of Cognitive Neuroscience, 32(11), 2145-2158. doi:10.1162/jocn_a_01608.

    Abstract

    When speech perception is difficult, one way listeners adjust is by reconfiguring phoneme category boundaries, drawing on contextual information. Both lexical knowledge and lipreading cues are used in this way, but it remains unknown whether these two differing forms of perceptual learning are similar at a neural level. This study compared phoneme boundary adjustments driven by lexical or audiovisual cues, using ultra-high-field 7-T fMRI. During imaging, participants heard exposure stimuli and test stimuli. Exposure stimuli for lexical retuning were audio recordings of words, and those for audiovisual recalibration were audio–video recordings of lip movements during utterances of pseudowords. Test stimuli were ambiguous phonetic strings presented without context, and listeners reported what phoneme they heard. Reports reflected phoneme biases in preceding exposure blocks (e.g., more reported /p/ after /p/-biased exposure). Analysis of corresponding brain responses indicated that both forms of cue use were associated with a network of activity across the temporal cortex, plus parietal, insula, and motor areas. Audiovisual recalibration also elicited significant occipital cortex activity despite the lack of visual stimuli. Activity levels in several ROIs also covaried with strength of audiovisual recalibration, with greater activity accompanying larger recalibration shifts. Similar activation patterns appeared for lexical retuning, but here, no significant ROIs were identified. Audiovisual and lexical forms of perceptual learning thus induce largely similar brain response patterns. However, audiovisual recalibration involves additional visual cortex contributions, suggesting that previously acquired visual information (on lip movements) is retrieved and deployed to disambiguate auditory perception.
  • Choi, J., Broersma, M., & Cutler, A. (2018). Phonetic learning is not enhanced by sequential exposure to more than one language. Linguistic Research, 35(3), 567-581. doi:10.17250/khisli.35.3.201812.006.

    Abstract

    Several studies have documented that international adoptees, who in early years have experienced a change from a language used in their birth country to a new language in an adoptive country, benefit from the limited early exposure to the birth language when relearning that language’s sounds later in life. The adoptees’ relearning advantages have been argued to be conferred by lasting birth-language knowledge obtained from the early exposure. However, it is also plausible to assume that the advantages may arise from adoptees’ superior ability to learn language sounds in general, as a result of their unusual linguistic experience, i.e., exposure to multiple languages in sequence early in life. If this is the case, then the adoptees’ relearning benefits should generalize to previously unheard language sounds, rather than be limited to their birth-language sounds. In the present study, adult Korean adoptees in the Netherlands and matched Dutch-native controls were trained on identifying a Japanese length distinction to which they had never been exposed before. The adoptees and Dutch controls did not differ on any test carried out before, during, or after the training, indicating that observed adoptee advantages for birth-language relearning do not generalize to novel, previously unheard language sounds. The finding thus fails to support the suggestion that birth-language relearning advantages may arise from enhanced ability to learn language sounds in general conferred by early experience in multiple languages. Rather, our finding supports the original contention that such advantages involve memory traces obtained before adoption
  • Johnson, E. K., Bruggeman, L., & Cutler, A. (2018). Abstraction and the (misnamed) language familiarity effect. Cognitive Science, 42, 633-645. doi:10.1111/cogs.12520.

    Abstract

    Talkers are recognized more accurately if they are speaking the listeners’ native language rather than an unfamiliar language. This “language familiarity effect” has been shown not to depend upon comprehension and must instead involve language sound patterns. We further examine the level of sound-pattern processing involved, by comparing talker recognition in foreign languages versus two varieties of English, by (a) English speakers of one variety, (b) English speakers of the other variety, and (c) non-native listeners (more familiar with one of the varieties). All listener groups performed better with native than foreign speech, but no effect of language variety appeared: Native listeners discriminated talkers equally well in each, with the native variety never outdoing the other variety, and non-native listeners discriminated talkers equally poorly in each, irrespective of the variety's familiarity. The results suggest that this talker recognition effect rests not on simple familiarity, but on an abstract level of phonological processing
  • Kidd, E., Junge, C., Spokes, T., Morrison, L., & Cutler, A. (2018). Individual differences in infant speech segmentation: Achieving the lexical shift. Infancy, 23(6), 770-794. doi:10.1111/infa.12256.

    Abstract

    We report a large‐scale electrophysiological study of infant speech segmentation, in which over 100 English‐acquiring 9‐month‐olds were exposed to unfamiliar bisyllabic words embedded in sentences (e.g., He saw a wild eagle up there), after which their brain responses to either the just‐familiarized word (eagle) or a control word (coral) were recorded. When initial exposure occurs in continuous speech, as here, past studies have reported that even somewhat older infants do not reliably recognize target words, but that successful segmentation varies across children. Here, we both confirm and further uncover the nature of this variation. The segmentation response systematically varied across individuals and was related to their vocabulary development. About one‐third of the group showed a left‐frontally located relative negativity in response to familiar versus control targets, which has previously been described as a mature response. Another third showed a similarly located positive‐going reaction (a previously described immature response), and the remaining third formed an intermediate grouping that was primarily characterized by an initial response delay. A fine‐grained group‐level analysis suggested that a developmental shift to a lexical mode of processing occurs toward the end of the first year, with variation across individual infants in the exact timing of this shift.

    Additional information

    supporting information
  • Norris, D., McQueen, J. M., & Cutler, A. (2018). Commentary on “Interaction in spoken word recognition models". Frontiers in Psychology, 9: 1568. doi:10.3389/fpsyg.2018.01568.
  • Cutler, A. (2009). Greater sensitivity to prosodic goodness in non-native than in native listeners. Journal of the Acoustical Society of America, 125, 3522-3525. doi:10.1121/1.3117434.

    Abstract

    English listeners largely disregard suprasegmental cues to stress in recognizing words. Evidence for this includes the demonstration of Fear et al. [J. Acoust. Soc. Am. 97, 1893–1904 (1995)] that cross-splicings are tolerated between stressed and unstressed full vowels (e.g., au- of autumn, automata). Dutch listeners, however, do exploit suprasegmental stress cues in recognizing native-language words. In this study, Dutch listeners were presented with English materials from the study of Fear et al. Acceptability ratings by these listeners revealed sensitivity to suprasegmental mismatch, in particular, in replacements of unstressed full vowels by higher-stressed vowels, thus evincing greater sensitivity to prosodic goodness than had been shown by the original native listener group.
  • Cutler, A., Otake, T., & McQueen, J. M. (2009). Vowel devoicing and the perception of spoken Japanese words. Journal of the Acoustical Society of America, 125(3), 1693-1703. doi:10.1121/1.3075556.

    Abstract

    Three experiments, in which Japanese listeners detected Japanese words embedded in nonsense sequences, examined the perceptual consequences of vowel devoicing in that language. Since vowelless sequences disrupt speech segmentation [Norris et al. (1997). Cognit. Psychol. 34, 191– 243], devoicing is potentially problematic for perception. Words in initial position in nonsense sequences were detected more easily when followed by a sequence containing a vowel than by a vowelless segment (with or without further context), and vowelless segments that were potential devoicing environments were no easier than those not allowing devoicing. Thus asa, “morning,” was easier in asau or asazu than in all of asap, asapdo, asaf, or asafte, despite the fact that the /f/ in the latter two is a possible realization of fu, with devoiced [u]. Japanese listeners thus do not treat devoicing contexts as if they always contain vowels. Words in final position in nonsense sequences, however, produced a different pattern: here, preceding vowelless contexts allowing devoicing impeded word detection less strongly (so, sake was detected less accurately, but not less rapidly, in nyaksake—possibly arising from nyakusake—than in nyagusake). This is consistent with listeners treating consonant sequences as potential realizations of parts of existing lexical candidates wherever possible.
  • Kooijman, V., Hagoort, P., & Cutler, A. (2009). Prosodic structure in early word segmentation: ERP evidence from Dutch ten-month-olds. Infancy, 14, 591 -612. doi:10.1080/15250000903263957.

    Abstract

    Recognizing word boundaries in continuous speech requires detailed knowledge of the native language. In the first year of life, infants acquire considerable word segmentation abilities. Infants at this early stage in word segmentation rely to a large extent on the metrical pattern of their native language, at least in stress-based languages. In Dutch and English (both languages with a preferred trochaic stress pattern), segmentation of strong-weak words develops rapidly between 7 and 10 months of age. Nevertheless, trochaic languages contain not only strong-weak words but also words with a weak-strong stress pattern. In this article, we present electrophysiological evidence of the beginnings of weak-strong word segmentation in Dutch 10-month-olds. At this age, the ability to combine different cues for efficient word segmentation does not yet seem to be completely developed. We provide evidence that Dutch infants still largely rely on strong syllables, even for the segmentation of weak-strong words.
  • Tyler, M., & Cutler, A. (2009). Cross-language differences in cue use for speech segmentation. Journal of the Acoustical Society of America, 126, 367-376. doi:10.1121/1.3129127.

    Abstract

    Two artificial-language learning experiments directly compared English, French, and Dutch listeners’ use of suprasegmental cues for continuous-speech segmentation. In both experiments, listeners heard unbroken sequences of consonant-vowel syllables, composed of recurring three- and four-syllable “words.” These words were demarcated by(a) no cue other than transitional probabilities induced by their recurrence, (b) a consistent left-edge cue, or (c) a consistent right-edge cue. Experiment 1 examined a vowel lengthening cue. All three listener groups benefited from this cue in right-edge position; none benefited from it in left-edge position. Experiment 2 examined a pitch-movement cue. English listeners used this cue in left-edge position, French listeners used it in right-edge position, and Dutch listeners used it in both positions. These findings are interpreted as evidence of both language-universal and language-specific effects. Final lengthening is a language-universal effect expressing a more general (non-linguistic) mechanism. Pitch movement expresses prominence which has characteristically different placements across languages: typically at right edges in French, but at left edges in English and Dutch. Finally, stress realization in English versus Dutch encourages greater attention to suprasegmental variation by Dutch than by English listeners, allowing Dutch listeners to benefit from an informative pitch-movement cue even in an uncharacteristic position.
  • Cutler, A., & Foss, D. (1977). On the role of sentence stress in sentence processing. Language and Speech, 20, 1-10.
  • Fay, D., & Cutler, A. (1977). Malapropisms and the structure of the mental lexicon. Linguistic Inquiry, 8, 505-520. Retrieved from http://www.jstor.org/stable/4177997.

Share this page