Anne Cutler

Publications

Displaying 1 - 21 of 21
  • Bruggeman, L., & Cutler, A. (2019). The dynamics of lexical activation and competition in bilinguals’ first versus second language. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 1342-1346). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    Speech input causes listeners to activate multiple candidate words which then compete with one another. These include onset competitors, that share a beginning (bumper, butter), but also, counterintuitively, rhyme competitors, sharing an ending (bumper, jumper). In L1, competition is typically stronger for onset than for rhyme. In L2, onset competition has been attested but rhyme competition has heretofore remained largely unexamined. We assessed L1 (Dutch) and L2 (English) word recognition by the same late-bilingual individuals. In each language, eye gaze was recorded as listeners heard sentences and viewed sets of drawings: three unrelated, one depicting an onset or rhyme competitor of a word in the input. Activation patterns revealed substantial onset competition but no significant rhyme competition in either L1 or L2. Rhyme competition may thus be a “luxury” feature of maximally efficient listening, to be abandoned when resources are scarcer, as in listening by late bilinguals, in either language.
  • Cutler, A., Burchfield, A., & Antoniou, M. (2019). A criterial interlocutor tally for successful talker adaptation? In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 1485-1489). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    Part of the remarkable efficiency of listening is accommodation to unfamiliar talkers’ specific pronunciations by retuning of phonemic intercategory boundaries. Such retuning occurs in second (L2) as well as first language (L1); however, recent research with emigrés revealed successful adaptation in the environmental L2 but, unprecedentedly, not in L1 despite continuing L1 use. A possible explanation involving relative exposure to novel talkers is here tested in heritage language users with Mandarin as family L1 and English as environmental language. In English, exposure to an ambiguous sound in disambiguating word contexts prompted the expected adjustment of phonemic boundaries in subsequent categorisation. However, no adjustment occurred in Mandarin, again despite regular use. Participants reported highly asymmetric interlocutor counts in the two languages. We conclude that successful retuning ability requires regular exposure to novel talkers in the language in question, a criterion not met for the emigrés’ or for these heritage users’ L1.
  • Joo, H., Jang, J., Kim, S., Cho, T., & Cutler, A. (2019). Prosodic structural effects on coarticulatory vowel nasalization in Australian English in comparison to American English. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 835-839). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    This study investigates effects of prosodic factors (prominence, boundary) on coarticulatory Vnasalization in Australian English (AusE) in CVN and NVC in comparison to those in American English (AmE). As in AmE, prominence was found to lengthen N, but to reduce V-nasalization, enhancing N’s nasality and V’s orality, respectively (paradigmatic contrast enhancement). But the prominence effect in CVN was more robust than that in AmE. Again similar to findings in AmE, boundary induced a reduction of N-duration and V-nasalization phrase-initially (syntagmatic contrast enhancement), and increased the nasality of both C and V phrasefinally. But AusE showed some differences in terms of the magnitude of V nasalization and N duration. The results suggest that the linguistic contrast enhancements underlie prosodic-structure modulation of coarticulatory V-nasalization in comparable ways across dialects, while the fine phonetic detail indicates that the phonetics-prosody interplay is internalized in the individual dialect’s phonetic grammar.
  • Kember, H., Choi, J., Yu, J., & Cutler, A. (2019). The processing of linguistic prominence. Language and Speech. Advance online publication. doi:10.1177/0023830919880217.

    Abstract

    Prominence, the expression of informational weight within utterances, can be signaled by prosodic highlighting (head-prominence, as in English) or by position (as in Korean edge-prominence). Prominence confers processing advantages, even if conveyed only by discourse manipulations. Here we compared processing of prominence in English and Korean, using a task that indexes processing success, namely recognition memory. In each language, participants’ memory was tested for target words heard in sentences in which they were prominent due to prosody, position, both or neither. Prominence produced recall advantage, but the relative effects differed across language. For Korean listeners the positional advantage was greater, but for English listeners prosodic and syntactic prominence had equivalent and additive effects. In a further experiment semantic and phonological foils tested depth of processing of the recall targets. Both foil types were correctly rejected, suggesting that semantic processing had not reached the level at which word form was no longer available. Together the results suggest that prominence processing is primarily driven by universal effects of information structure; but language-specific differences in frequency of experience prompt different relative advantages of prominence signal types. Processing efficiency increases in each case, however, creating more accurate and more rapidly contactable memory representations.
  • Nazzi, T., & Cutler, A. (2019). How consonants and vowels shape spoken-language recognition. Annual Review of Linguistics, 5, 25-47. doi:10.1146/annurev-linguistics-011718-011919.

    Abstract

    All languages instantiate a consonant/vowel contrast. This contrast has processing consequences at different levels of spoken-language recognition throughout the lifespan. In adulthood, lexical processing is more strongly associated with consonant than with vowel processing; this has been demonstrated across 13 languages from seven language families and in a variety of auditory lexical-level tasks (deciding whether a spoken input is a word, spotting a real word embedded in a minimal context, reconstructing a word minimally altered into a pseudoword, learning new words or the “words” of a made-up language), as well as in written-word tasks involving phonological processing. In infancy, a consonant advantage in word learning and recognition is found to emerge during development in some languages, though possibly not in others, revealing that the stronger lexicon–consonant association found in adulthood is learned. Current research is evaluating the relative contribution of the early acquisition of the acoustic/phonetic and lexical properties of the native language in the emergence of this association
  • Burnham, D., Ambikairajah, E., Arciuli, J., Bennamoun, M., Best, C. T., Bird, S., Butcher, A. R., Cassidy, S., Chetty, G., Cox, F. M., Cutler, A., Dale, R., Epps, J. R., Fletcher, J. M., Goecke, R., Grayden, D. B., Hajek, J. T., Ingram, J. C., Ishihara, S., Kemp, N. and 10 moreBurnham, D., Ambikairajah, E., Arciuli, J., Bennamoun, M., Best, C. T., Bird, S., Butcher, A. R., Cassidy, S., Chetty, G., Cox, F. M., Cutler, A., Dale, R., Epps, J. R., Fletcher, J. M., Goecke, R., Grayden, D. B., Hajek, J. T., Ingram, J. C., Ishihara, S., Kemp, N., Kinoshita, Y., Kuratate, T., Lewis, T. W., Loakes, D. E., Onslow, M., Powers, D. M., Rose, P., Togneri, R., Tran, D., & Wagner, M. (2009). A blueprint for a comprehensive Australian English auditory-visual speech corpus. In M. Haugh, K. Burridge, J. Mulder, & P. Peters (Eds.), Selected proceedings of the 2008 HCSNet Workshop on Designing the Australian National Corpus (pp. 96-107). Somerville, MA: Cascadilla Proceedings Project.

    Abstract

    Large auditory-visual (AV) speech corpora are the grist of modern research in speech science, but no such corpus exists for Australian English. This is unfortunate, for speech science is the brains behind speech technology and applications such as text-to-speech (TTS) synthesis, automatic speech recognition (ASR), speaker recognition and forensic identification, talking heads, and hearing prostheses. Advances in these research areas in Australia require a large corpus of Australian English. Here the authors describe a blueprint for building the Big Australian Speech Corpus (the Big ASC), a corpus of over 1,100 speakers from urban and rural Australia, including speakers of non-indigenous, indigenous, ethnocultural, and disordered forms of Australian English, each of whom would be sampled on three occasions in a range of speech tasks designed by the researchers who would be using the corpus.
  • Cutler, A. (2009). Greater sensitivity to prosodic goodness in non-native than in native listeners. Journal of the Acoustical Society of America, 125, 3522-3525. doi:10.1121/1.3117434.

    Abstract

    English listeners largely disregard suprasegmental cues to stress in recognizing words. Evidence for this includes the demonstration of Fear et al. [J. Acoust. Soc. Am. 97, 1893–1904 (1995)] that cross-splicings are tolerated between stressed and unstressed full vowels (e.g., au- of autumn, automata). Dutch listeners, however, do exploit suprasegmental stress cues in recognizing native-language words. In this study, Dutch listeners were presented with English materials from the study of Fear et al. Acceptability ratings by these listeners revealed sensitivity to suprasegmental mismatch, in particular, in replacements of unstressed full vowels by higher-stressed vowels, thus evincing greater sensitivity to prosodic goodness than had been shown by the original native listener group.
  • Cutler, A., Davis, C., & Kim, J. (2009). Non-automaticity of use of orthographic knowledge in phoneme evaluation. In Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009) (pp. 380-383). Causal Productions Pty Ltd.

    Abstract

    Two phoneme goodness rating experiments addressed the role of orthographic knowledge in the evaluation of speech sounds. Ratings for the best tokens of /s/ were higher in words spelled with S (e.g., bless) than in words where /s/ was spelled with C (e.g., voice). This difference did not appear for analogous nonwords for which every lexical neighbour had either S or C spelling (pless, floice). Models of phonemic processing incorporating obligatory influence of lexical information in phonemic processing cannot explain this dissociation; the data are consistent with models in which phonemic decisions are not subject to necessary top-down lexical influence.
  • Cutler, A., Otake, T., & McQueen, J. M. (2009). Vowel devoicing and the perception of spoken Japanese words. Journal of the Acoustical Society of America, 125(3), 1693-1703. doi:10.1121/1.3075556.

    Abstract

    Three experiments, in which Japanese listeners detected Japanese words embedded in nonsense sequences, examined the perceptual consequences of vowel devoicing in that language. Since vowelless sequences disrupt speech segmentation [Norris et al. (1997). Cognit. Psychol. 34, 191– 243], devoicing is potentially problematic for perception. Words in initial position in nonsense sequences were detected more easily when followed by a sequence containing a vowel than by a vowelless segment (with or without further context), and vowelless segments that were potential devoicing environments were no easier than those not allowing devoicing. Thus asa, “morning,” was easier in asau or asazu than in all of asap, asapdo, asaf, or asafte, despite the fact that the /f/ in the latter two is a possible realization of fu, with devoiced [u]. Japanese listeners thus do not treat devoicing contexts as if they always contain vowels. Words in final position in nonsense sequences, however, produced a different pattern: here, preceding vowelless contexts allowing devoicing impeded word detection less strongly (so, sake was detected less accurately, but not less rapidly, in nyaksake—possibly arising from nyakusake—than in nyagusake). This is consistent with listeners treating consonant sequences as potential realizations of parts of existing lexical candidates wherever possible.
  • Kooijman, V., Hagoort, P., & Cutler, A. (2009). Prosodic structure in early word segmentation: ERP evidence from Dutch ten-month-olds. Infancy, 14, 591 -612. doi:10.1080/15250000903263957.

    Abstract

    Recognizing word boundaries in continuous speech requires detailed knowledge of the native language. In the first year of life, infants acquire considerable word segmentation abilities. Infants at this early stage in word segmentation rely to a large extent on the metrical pattern of their native language, at least in stress-based languages. In Dutch and English (both languages with a preferred trochaic stress pattern), segmentation of strong-weak words develops rapidly between 7 and 10 months of age. Nevertheless, trochaic languages contain not only strong-weak words but also words with a weak-strong stress pattern. In this article, we present electrophysiological evidence of the beginnings of weak-strong word segmentation in Dutch 10-month-olds. At this age, the ability to combine different cues for efficient word segmentation does not yet seem to be completely developed. We provide evidence that Dutch infants still largely rely on strong syllables, even for the segmentation of weak-strong words.
  • Tyler, M., & Cutler, A. (2009). Cross-language differences in cue use for speech segmentation. Journal of the Acoustical Society of America, 126, 367-376. doi:10.1121/1.3129127.

    Abstract

    Two artificial-language learning experiments directly compared English, French, and Dutch listeners’ use of suprasegmental cues for continuous-speech segmentation. In both experiments, listeners heard unbroken sequences of consonant-vowel syllables, composed of recurring three- and four-syllable “words.” These words were demarcated by(a) no cue other than transitional probabilities induced by their recurrence, (b) a consistent left-edge cue, or (c) a consistent right-edge cue. Experiment 1 examined a vowel lengthening cue. All three listener groups benefited from this cue in right-edge position; none benefited from it in left-edge position. Experiment 2 examined a pitch-movement cue. English listeners used this cue in left-edge position, French listeners used it in right-edge position, and Dutch listeners used it in both positions. These findings are interpreted as evidence of both language-universal and language-specific effects. Final lengthening is a language-universal effect expressing a more general (non-linguistic) mechanism. Pitch movement expresses prominence which has characteristically different placements across languages: typically at right edges in French, but at left edges in English and Dutch. Finally, stress realization in English versus Dutch encourages greater attention to suprasegmental variation by Dutch than by English listeners, allowing Dutch listeners to benefit from an informative pitch-movement cue even in an uncharacteristic position.
  • Akker, E., & Cutler, A. (2003). Prosodic cues to semantic structure in native and nonnative listening. Bilingualism: Language and Cognition, 6(2), 81-96. doi:10.1017/S1366728903001056.

    Abstract

    Listeners efficiently exploit sentence prosody to direct attention to words bearing sentence accent. This effect has been explained as a search for focus, furthering rapid apprehension of semantic structure. A first experiment supported this explanation: English listeners detected phoneme targets in sentences more rapidly when the target-bearing words were in accented position or in focussed position, but the two effects interacted, consistent with the claim that the effects serve a common cause. In a second experiment a similar asymmetry was observed with Dutch listeners and Dutch sentences. In a third and a fourth experiment, proficient Dutch users of English heard English sentences; here, however, the two effects did not interact. The results suggest that less efficient mapping of prosody to semantics may be one way in which nonnative listening fails to equal native listening.
  • Cutler, A., Murty, L., & Otake, T. (2003). Rhythmic similarity effects in non-native listening? In Proceedings of the 15th International Congress of Phonetic Sciences (PCPhS 2003) (pp. 329-332). Adelaide: Causal Productions.

    Abstract

    Listeners rely on native-language rhythm in segmenting speech; in different languages, stress-, syllable- or mora-based rhythm is exploited. This language-specificity affects listening to non- native speech, if native procedures are applied even though inefficient for the non-native language. However, speakers of two languages with similar rhythmic interpretation should segment their own and the other language similarly. This was observed to date only for related languages (English-Dutch; French-Spanish). We now report experiments in which Japanese listeners heard Telugu, a Dravidian language unrelated to Japanese, and Telugu listeners heard Japanese. In both cases detection of target sequences in speech was harder when target boundaries mismatched mora boundaries, exactly the pattern that Japanese listeners earlier exhibited with Japanese and other languages. These results suggest that Telugu and Japanese listeners use similar procedures in segmenting speech, and support the idea that languages fall into rhythmic classes, with aspects of phonological structure affecting listeners' speech segmentation.
  • Johnson, E. K., Jusczyk, P. W., Cutler, A., & Norris, D. (2003). Lexical viability constraints on speech segmentation by infants. Cognitive Psychology, 46(1), 65-97. doi:10.1016/S0010-0285(02)00507-8.

    Abstract

    The Possible Word Constraint limits the number of lexical candidates considered in speech recognition by stipulating that input should be parsed into a string of lexically viable chunks. For instance, an isolated single consonant is not a feasible word candidate. Any segmentation containing such a chunk is disfavored. Five experiments using the head-turn preference procedure investigated whether, like adults, 12-month-olds observe this constraint in word recognition. In Experiments 1 and 2, infants were familiarized with target words (e.g., rush), then tested on lists of nonsense items containing these words in “possible” (e.g., “niprush” [nip + rush]) or “impossible” positions (e.g., “prush” [p + rush]). The infants listened significantly longer to targets in “possible” versus “impossible” contexts when targets occurred at the end of nonsense items (rush in “prush”), but not when they occurred at the beginning (tan in “tance”). In Experiments 3 and 4, 12-month-olds were similarly familiarized with target words, but test items were real words in sentential contexts (win in “wind” versus “window”). The infants listened significantly longer to words in the “possible” condition regardless of target location. Experiment 5 with targets at the beginning of isolated real words (e.g., win in “wind”) replicated Experiment 2 in showing no evidence of viability effects in beginning position. Taken together, the findings suggest that, in situations in which 12-month-olds are required to rely on their word segmentation abilities, they give evidence of observing lexical viability constraints in the way that they parse fluent speech.
  • McQueen, J. M., Cutler, A., & Norris, D. (2003). Flow of information in the spoken word recognition system. Speech Communication, 41(1), 257-270. doi:10.1016/S0167-6393(02)00108-5.

    Abstract

    Spoken word recognition consists of two major component processes. First, at the prelexical stage, an abstract description of the utterance is generated from the information in the speech signal. Second, at the lexical stage, this description is used to activate all the words stored in the mental lexicon which match the input. These multiple candidate words then compete with each other. We review evidence which suggests that positive (match) and negative (mismatch) information of both a segmental and a suprasegmental nature is used to constrain this activation and competition process. We then ask whether, in addition to the necessary influence of the prelexical stage on the lexical stage, there is also feedback from the lexicon to the prelexical level. In two phonetic categorization experiments, Dutch listeners were asked to label both syllable-initial and syllable-final ambiguous fricatives (e.g., sounds ranging from [f] to [s]) in the word–nonword series maf–mas, and the nonword–word series jaf–jas. They tended to label the sounds in a lexically consistent manner (i.e., consistent with the word endpoints of the series). These lexical effects became smaller in listeners’ slower responses, even when the listeners were put under pressure to respond as fast as possible. Our results challenge models of spoken word recognition in which feedback modulates the prelexical analysis of the component sounds of a word whenever that word is heard
  • Norris, D., McQueen, J. M., & Cutler, A. (2003). Perceptual learning in speech. Cognitive Psychology, 47(2), 204-238. doi:10.1016/S0010-0285(03)00006-9.

    Abstract

    This study demonstrates that listeners use lexical knowledge in perceptual learning of speech sounds. Dutch listeners first made lexical decisions on Dutch words and nonwords. The final fricative of 20 critical words had been replaced by an ambiguous sound, between [f] and [s]. One group of listeners heard ambiguous [f]-final words (e.g., [WI tlo?], from witlof, chicory) and unambiguous [s]-final words (e.g., naaldbos, pine forest). Another group heard the reverse (e.g., ambiguous [na:ldbo?], unambiguous witlof). Listeners who had heard [?] in [f]-final words were subsequently more likely to categorize ambiguous sounds on an [f]–[s] continuum as [f] than those who heard [?] in [s]-final words. Control conditions ruled out alternative explanations based on selective adaptation and contrast. Lexical information can thus be used to train categorization of speech. This use of lexical information differs from the on-line lexical feedback embodied in interactive models of speech perception. In contrast to on-line feedback, lexical feedback for learning is of benefit to spoken word recognition (e.g., in adapting to a newly encountered dialect).
  • Shi, R., Werker, J., & Cutler, A. (2003). Function words in early speech perception. In Proceedings of the 15th International Congress of Phonetic Sciences (pp. 3009-3012).

    Abstract

    Three experiments examined whether infants recognise functors in phrases, and whether their representations of functors are phonetically well specified. Eight- and 13- month-old English infants heard monosyllabic lexical words preceded by real functors (e.g., the, his) versus nonsense functors (e.g., kuh); the latter were minimally modified segmentally (but not prosodically) from real functors. Lexical words were constant across conditions; thus recognition of functors would appear as longer listening time to sequences with real functors. Eightmonth- olds' listening times to sequences with real versus nonsense functors did not significantly differ, suggesting that they did not recognise real functors, or functor representations lacked phonetic specification. However, 13-month-olds listened significantly longer to sequences with real functors. Thus, somewhere between 8 and 13 months of age infants learn familiar functors and represent them with segmental detail. We propose that accumulated frequency of functors in input in general passes a critical threshold during this time.
  • Smits, R., Warner, N., McQueen, J. M., & Cutler, A. (2003). Unfolding of phonetic information over time: A database of Dutch diphone perception. Journal of the Acoustical Society of America, 113(1), 563-574. doi:10.1121/1.1525287.

    Abstract

    We present the results of a large-scale study on speech perception, assessing the number and type of perceptual hypotheses which listeners entertain about possible phoneme sequences in their language. Dutch listeners were asked to identify gated fragments of all 1179 diphones of Dutch, providing a total of 488 520 phoneme categorizations. The results manifest orderly uptake of acoustic information in the signal. Differences across phonemes in the rate at which fully correct recognition was achieved arose as a result of whether or not potential confusions could occur with other phonemes of the language ~long with short vowels, affricates with their initial components, etc.!. These data can be used to improve models of how acoustic phonetic information is mapped onto the mental lexicon during speech comprehension.
  • Spinelli, E., McQueen, J. M., & Cutler, A. (2003). Processing resyllabified words in French. Journal of Memory and Language, 48(2), 233-254. doi:10.1016/S0749-596X(02)00513-2.
  • Weber, A., & Cutler, A. (2003). Perceptual similarity co-existing with lexical dissimilarity [Abstract]. Abstracts of the 146th Meeting of the Acoustical Society of America. Journal of the Acoustical Society of America, 114(4 Pt. 2), 2422. doi:10.1121/1.1601094.

    Abstract

    The extreme case of perceptual similarity is indiscriminability, as when two second‐language phonemes map to a single native category. An example is the English had‐head vowel contrast for Dutch listeners; Dutch has just one such central vowel, transcribed [E]. We examine whether the failure to discriminate in phonetic categorization implies indiscriminability in other—e.g., lexical—processing. Eyetracking experiments show that Dutch‐native listeners instructed in English to ‘‘click on the panda’’ look (significantly more than native listeners) at a pictured pencil, suggesting that pan‐ activates their lexical representation of pencil. The reverse, however, is not the case: ‘‘click on the pencil’’ does not induce looks to a panda, suggesting that pen‐ does not activate panda in the lexicon. Thus prelexically undiscriminated second‐language distinctions can nevertheless be maintained in stored lexical representations. The problem of mapping a resulting unitary input to two distinct categories in lexical representations is solved by allowing input to activate only one second‐language category. For Dutch listeners to English, this is English [E], as a result of which no vowels in the signal ever map to words containing [ae]. We suggest that the choice of category is here motivated by a more abstract, phonemic, metric of similarity.
  • Cutler, A. (1970). An experimental method for semantic field study. Linguistic Communications, 2, 87-94.

    Abstract

    This paper emphasizes the need for empirical research and objective discovery procedures in semantics, and illustrates a method by which these goals may be obtained. The aim of the methodology described is to provide a description of the internal structure of a semantic field by eliciting the description--in an objective, standardized manner--from a representative group of native speakers. This would produce results that would be equally obtainable by any linguist using the same method under the same conditions with a similarly representative set of informants. The standardized method suggested by the author is the Semantic Differential developed by C. E. Osgood in the 1950's. Applying this method to semantic research, it is further hypothesized that, should different members of a semantic field be employed as concepts on a Semantic Differential task, a factor analysis of the results would reveal the dimensions operative within the body of data. The author demonstrates the use of the Semantic Differential and factor analysis in an actual experiment.

Share this page