Anne Cutler

Publications

  • Burchfield, L. A., Luk, S.-H.-K., Antoniou, M., & Cutler, A. (2017). Lexically guided perceptual learning in Mandarin Chinese. In Proceedings of Interspeech 2017 (pp. 576-580). doi:10.21437/Interspeech.2017-618.

    Abstract

    Lexically guided perceptual learning refers to the use of lexical knowledge to retune speech categories and thereby adapt to a novel talker’s pronunciation. This adaptation has been extensively documented, but primarily for segmental-based learning in English and Dutch. In languages with lexical tone, such as Mandarin Chinese, tonal categories can also be retuned in this way, but segmental category retuning had not been studied. We report two experiments in which Mandarin Chinese listeners were exposed to an ambiguous mixture of [f] and [s] in lexical contexts favoring an interpretation as either [f] or [s]. Listeners were subsequently more likely to identify sounds along a continuum between [f] and [s], and to interpret minimal word pairs, in a manner consistent with this exposure. Thus lexically guided perceptual learning of segmental categories had indeed taken place, consistent with suggestions that such learning may be a universally available adaptation process.
  • Choi, J., Cutler, A., & Broersma, M. (2017). Early development of abstract language knowledge: Evidence from perception-production transfer of birth-language memory. Royal Society Open Science, 4: 160660. doi:10.1098/rsos.160660.

    Abstract

    Children adopted early in life into another linguistic community typically forget their birth language but retain, unaware, relevant linguistic knowledge that may facilitate (re)learning of birth-language patterns. Understanding the nature of this knowledge can shed light on how language is acquired. Here, international adoptees from Korea with Dutch as their current language, and matched Dutch-native controls, provided speech production data on a Korean consonantal distinction unlike any Dutch distinctions, at the outset and end of an intensive perceptual training. The productions, elicited in a repetition task, were identified and rated by Korean listeners. Adoptees' production scores improved significantly more across the training period than control participants' scores, and, for adoptees only, relative production success correlated significantly with the rate of learning in perception (which had, as predicted, also surpassed that of the controls). Of the adoptee group, half had been adopted at 17 months or older (when talking would have begun), while half had been prelinguistic (under six months). The former group, with production experience, showed no advantage over the group without. Thus the adoptees' retained knowledge of Korean transferred from perception to production and appears to be abstract in nature rather than dependent on the amount of experience.
  • Choi, J., Broersma, M., & Cutler, A. (2017). Early phonology revealed by international adoptees' birth language retention. Proceedings of the National Academy of Sciences of the United States of America, 114(28), 7307-7312. doi:10.1073/pnas.1706405114.

    Abstract

    Until at least 6 mo of age, infants show good discrimination for familiar phonetic contrasts (i.e., those heard in the environmental language) and contrasts that are unfamiliar. Adult-like discrimination (significantly worse for nonnative than for native contrasts) appears only later, by 9–10 mo. This has been interpreted as indicating that infants have no knowledge of phonology until vocabulary development begins, after 6 mo of age. Recently, however, word recognition has been observed before age 6 mo, apparently decoupling the vocabulary and phonology acquisition processes. Here we show that phonological acquisition is also in progress before 6 mo of age. The evidence comes from retention of birth-language knowledge in international adoptees. In the largest ever such study, we recruited 29 adult Dutch speakers who had been adopted from Korea when young and had no conscious knowledge of Korean language at all. Half were adopted at age 3–5 mo (before native-specific discrimination develops) and half at 17 mo or older (after word learning has begun). In a short intensive training program, we observe that adoptees (compared with 29 matched controls) more rapidly learn tripartite Korean consonant distinctions without counterparts in their later-acquired Dutch, suggesting that the adoptees retained phonological knowledge about the Korean distinction. The advantage is equivalent for the younger-adopted and the older-adopted groups, and both groups not only acquire the tripartite distinction for the trained consonants but also generalize it to untrained consonants. Although infants younger than 6 mo can still discriminate unfamiliar phonetic distinctions, this finding indicates that native-language phonological knowledge is nonetheless being acquired at that age.
  • Cutler, A. (2017). Converging evidence for abstract phonological knowledge in speech processing. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 1447-1448). Austin, TX: Cognitive Science Society.

    Abstract

    The perceptual processing of speech is a constant interplay of multiple competing albeit convergent processes: acoustic input vs. higher-level representations, universal mechanisms vs. language-specific, veridical traces of speech experience vs. construction and activation of abstract representations. The present summary concerns the third of these issues. The ability to generalise across experience and to deal with resulting abstractions is the hallmark of human cognition, visible even in early infancy. In speech processing, abstract representations play a necessary role in both production and perception. New sorts of evidence are now informing our understanding of the breadth of this role.
  • Ip, M. H. K., & Cutler, A. (2017). Intonation facilitates prediction of focus even in the presence of lexical tones. In Proceedings of Interspeech 2017 (pp. 1218-1222). doi:10.21437/Interspeech.2017-264.

    Abstract

    In English and Dutch, listeners entrain to prosodic contours to predict where focus will fall in an utterance. However, is this strategy universally available, even in languages with different phonological systems? In a phoneme detection experiment, we examined whether prosodic entrainment is also found in Mandarin Chinese, a tone language, where in principle the use of pitch for lexical identity may take precedence over the use of pitch cues to salience. Consistent with the results from Germanic languages, response times were facilitated when preceding intonation predicted accent on the target-bearing word. Acoustic analyses revealed greater F0 range in the preceding intonation of the predicted-accent sentences. These findings have implications for how universal and language-specific mechanisms interact in the processing of salience.
  • Goudbeek, M., Smits, R., Cutler, A., & Swingley, D. (2017). Auditory and phonetic category formation. In H. Cohen, & C. Lefebvre (Eds.), Handbook of categorization in cognitive science (2nd revised ed.) (pp. 687-708). Amsterdam: Elsevier.
  • Kember, H., Grohe, A.-K., Zahner, K., Braun, B., Weber, A., & Cutler, A. (2017). Similar prosodic structure perceived differently in German and English. In Proceedings of Interspeech 2017 (pp. 1388-1392).

    Abstract

    English and German have similar prosody, but their speakers realize some pitch falls (not rises) in subtly different ways. We here test for asymmetry in perception. An ABX discrimination task requiring F0 slope or duration judgements on isolated vowels revealed no cross-language difference in duration or F0 fall discrimination, but discrimination of rises (realized similarly in each language) was less accurate for English than for German listeners. This unexpected finding may reflect greater sensitivity to rising patterns by German listeners, or reduced sensitivity by English listeners as a result of extensive exposure to phrase-final rises (“uptalk”) in their language.
  • Warner, N., & Cutler, A. (2017). Stress effects in vowel perception as a function of language-specific vocabulary patterns. Phonetica, 74, 81-106. doi:10.1159/000447428.

    Abstract

    Background/Aims: Evidence from spoken word recognition suggests that for English listeners, distinguishing full versus reduced vowels is important, but discerning stress differences involving the same full vowel (as in mu- from music or museum) is not. In Dutch, in contrast, the latter distinction is important. This difference arises from the relative frequency of unstressed full vowels in the two vocabularies. The goal of this paper is to determine how this difference in the lexicon influences the perception of stressed versus unstressed vowels. Methods: All possible sequences of two segments (diphones) in Dutch and in English were presented to native listeners in gated fragments. We recorded identification performance over time throughout the speech signal. The data were here analysed specifically for patterns in perception of stressed versus unstressed vowels. Results: The data reveal significantly larger stress effects (whereby unstressed vowels are harder to identify than stressed vowels) in English than in Dutch. Both language-specific and shared patterns appear regarding which vowels show stress effects. Conclusion: We explain the larger stress effect in English as reflecting the processing demands caused by the difference in use of unstressed vowels in the lexicon. The larger stress effect in English is due to relative inexperience with processing unstressed full vowels.
  • Bruggeman, L., & Cutler, A. (2016). Lexical manipulation as a discovery tool for psycholinguistic research. In C. Carignan, & M. D. Tyler (Eds.), Proceedings of the 16th Australasian International Conference on Speech Science and Technology (SST2016) (pp. 313-316).
  • Cutler, A., & Norris, D. (2016). Bottoms up! How top-down pitfalls ensnare speech perception researchers too. Commentary on C. Firestone & B. Scholl: Cognition does not affect perception: Evaluating the evidence for 'top-down' effects. Behavioral and Brain Sciences, e236. doi:10.1017/S0140525X15002745.

    Abstract

    Not only can the pitfalls that Firestone & Scholl (F&S) identify be generalised across multiple studies within the field of visual perception, but also they have general application outside the field wherever perceptual and cognitive processing are compared. We call attention to the widespread susceptibility of research on the perception of speech to versions of the same pitfalls.
  • Ip, M., & Cutler, A. (2016). Cross-language data on five types of prosodic focus. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Proceedings of Speech Prosody 2016 (pp. 330-334).

    Abstract

    To examine the relative roles of language-specific and language-universal mechanisms in the production of prosodic focus, we compared production of five different types of focus by native speakers of English and Mandarin. Two comparable dialogues were constructed for each language, with the same words appearing in focused and unfocused position; 24 speakers recorded each dialogue in each language. Duration, F0 (mean, maximum, range), and rms-intensity (mean, maximum) of all critical word tokens were measured. Across the different types of focus, cross-language differences were observed in the degree to which English versus Mandarin speakers use the different prosodic parameters to mark focus, suggesting that while prosody may be universally available for expressing focus, the means of its employment may be considerably language-specific.
  • Jeske, J., Kember, H., & Cutler, A. (2016). Native and non-native English speakers' use of prosody to predict sentence endings. In Proceedings of the 16th Australasian International Conference on Speech Science and Technology (SST2016).
  • Kember, H., Choi, J., & Cutler, A. (2016). Processing advantages for focused words in Korean. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Proceedings of Speech Prosody 2016 (pp. 702-705).

    Abstract

    In Korean, focus is expressed in accentual phrasing. To ascertain whether words focused in this manner enjoy a processing advantage analogous to that conferred by focus as expressed in, e.g., English and Dutch, we devised sentences with target words in one of four conditions: prosodic focus, syntactic focus, prosodic + syntactic focus, and no focus as a control. 32 native speakers of Korean listened to blocks of 10 sentences, then were presented visually with words and asked whether or not they had heard them. Overall, words with focus were recognised significantly faster and more accurately than unfocused words. In addition, words with syntactic focus or syntactic + prosodic focus were recognised faster than words with prosodic focus alone. As for other languages, Korean focus confers processing advantage on the words carrying it. While prosodic focus does provide an advantage, however, syntactic focus appears to provide the greater beneficial effect for recognition memory.
  • Norris, D., McQueen, J. M., & Cutler, A. (2016). Prediction, Bayesian inference and feedback in speech recognition. Language, Cognition and Neuroscience, 31(1), 4-18. doi:10.1080/23273798.2015.1081703.

    Abstract

    Speech perception involves prediction, but how is that prediction implemented? In cognitive models prediction has often been taken to imply that there is feedback of activation from lexical to pre-lexical processes as implemented in interactive-activation models (IAMs). We show that simple activation feedback does not actually improve speech recognition. However, other forms of feedback can be beneficial. In particular, feedback can enable the listener to adapt to changing input, and can potentially help the listener to recognise unusual input, or recognise speech in the presence of competing sounds. The common feature of these helpful forms of feedback is that they are all ways of optimising the performance of speech recognition using Bayesian inference. That is, listeners make predictions about speech because speech recognition is optimal in the sense captured in Bayesian models.
  • Boland, J. E., & Cutler, A. (1995). Interaction with autonomy: Defining multiple output models in psycholinguistic theory. Working Papers in Linguistics, 45, 1-10. Retrieved from http://hdl.handle.net/2066/15768.

    Abstract

    There are currently a number of psycholinguistic models in which processing at a particular level of representation is characterized by the generation of multiple outputs, with resolution involving the use of information from higher levels of processing. Surprisingly, models with this architecture have been characterized as autonomous within the domain of word recognition and as interactive within the domain of sentence processing. We suggest that the apparent internal confusion is not, as might be assumed, due to fundamental differences between lexical and syntactic processing. Rather, we believe that the labels in each domain were chosen in order to obtain maximal contrast between a new model and the model or models that were currently dominating the field.
  • Boland, J. E., & Cutler, A. (1995). Interaction with autonomy: Multiple Output models and the inadequacy of the Great Divide. Cognition, 58, 309-320. doi:10.1016/0010-0277(95)00684-2.

    Abstract

    There are currently a number of psycholinguistic models in which processing at a particular level of representation is characterized by the generation of multiple outputs, with resolution - but not generation - involving the use of information from higher levels of processing. Surprisingly, models with this architecture have been characterized as autonomous within the domain of word recognition but as interactive within the domain of sentence processing. We suggest that the apparent confusion is not, as might be assumed, due to fundamental differences between lexical and syntactic processing. Rather, we believe that the labels in each domain were chosen in order to obtain maximal contrast between a new model and the model or models that were currently dominating the field. The contradiction serves to highlight the inadequacy of a simple autonomy/interaction dichotomy for characterizing the architectures of current processing models.
  • Cutler, A., & Chen, H.-C. (1995). Phonological similarity effects in Cantonese word recognition. In K. Elenius, & P. Branderud (Eds.), Proceedings of the Thirteenth International Congress of Phonetic Sciences: Vol. 1 (pp. 106-109). Stockholm: Stockholm University.

    Abstract

    Two lexical decision experiments in Cantonese are described in which the recognition of spoken target words as a function of phonological similarity to a preceding prime is investigated. Phonological similarity in first syllables produced inhibition, while similarity in second syllables led to facilitation. Differences between syllables in tonal and segmental structure had generally similar effects.
  • Cutler, A. (1995). Spoken word recognition and production. In J. L. Miller, & P. D. Eimas (Eds.), Speech, language and communication (pp. 97-136). New York: Academic Press.

    Abstract

    This chapter highlights that most language behavior consists of speaking and listening. The chapter also reveals differences and similarities between speaking and listening. The laboratory study of word production raises formidable problems; ensuring that a particular word is produced may subvert the spontaneous production process. Word production is investigated primarily via instances of processing failure, such as slips and tip-of-the-tongue (TOT) states, and via the picture-naming task. The methodology of word production is explained in the chapter. The chapter also explains the phenomenon of interaction between various stages of word production and the process of speech recognition. In this context, it explores the difference between sound and meaning and examines whether or not the comparisons are appropriate between the processes of recognition and production of spoken words. It also describes the similarities and differences in the structure of the recognition and production systems. Finally, the chapter highlights the common issues in recognition and production research, which include the nuances of frequency of occurrence, morphological structure, and phonological structure.
  • Cutler, A. (1995). Spoken-word recognition. In G. Bloothooft, V. Hazan, D. Hubert, & J. Llisterri (Eds.), European studies in phonetics and speech communication (pp. 66-71). Utrecht: OTS.
  • Cutler, A., & McQueen, J. M. (1995). The recognition of lexical units in speech. In B. De Gelder, & J. Morais (Eds.), Speech and reading: A comparative approach (pp. 33-47). Hove, UK: Erlbaum.
  • Cutler, A. (1995). The perception of rhythm in spoken and written language. In J. Mehler, & S. Franck (Eds.), Cognition on cognition (pp. 283-288). Cambridge, MA: MIT Press.
  • Cutler, A. (1995). Universal and Language-Specific in the Development of Speech. Biology International, (Special Issue 33).

    Additional information

    http://www.iubs.org/?id=34
  • Fear, B. D., Cutler, A., & Butterfield, S. (1995). The strong/weak syllable distinction in English. Journal of the Acoustical Society of America, 97, 1893-1904. doi:10.1121/1.412063.

    Abstract

    Strong and weak syllables in English can be distinguished on the basis of vowel quality, of stress, or of both factors. Critical for deciding between these factors are syllables containing unstressed unreduced vowels, such as the first syllable of automata. In this study 12 speakers produced sentences containing matched sets of words with initial vowels ranging from stressed to reduced, at normal and at fast speech rates. Measurements of the duration, intensity, F0, and spectral characteristics of the word-initial vowels showed that unstressed unreduced vowels differed significantly from both stressed and reduced vowels. This result held true across speaker sex and dialect. The vowels produced by one speaker were then cross-spliced across the words within each set, and the resulting words' acceptability was rated by listeners. In general, cross-spliced words were only rated significantly less acceptable than unspliced words when reduced vowels interchanged with any other vowel. Correlations between rated acceptability and acoustic characteristics of the cross-spliced words demonstrated that listeners were attending to duration, intensity, and spectral characteristics. Together these results suggest that unstressed unreduced vowels in English pattern differently from both stressed and reduced vowels, so that no acoustic support for a binary categorical distinction exists; nevertheless, listeners make such a distinction, grouping unstressed unreduced vowels by preference with stressed vowels.
  • McQueen, J. M., Cutler, A., Briscoe, T., & Norris, D. (1995). Models of continuous speech recognition and the contents of the vocabulary. Language and Cognitive Processes, 10, 309-331. doi:10.1080/01690969508407098.

    Abstract

    Several models of spoken word recognition postulate that recognition is achieved via a process of competition between lexical hypotheses. Competition not only provides a mechanism for isolated word recognition, it also assists in continuous speech recognition, since it offers a means of segmenting continuous input into individual words. We present statistics on the pattern of occurrence of words embedded in the polysyllabic words of the English vocabulary, showing that an overwhelming majority (84%) of polysyllables have shorter words embedded within them. Positional analyses show that these embeddings are most common at the onsets of the longer word. Although both phonological and syntactic constraints could rule out some embedded words, they do not remove the problem. Lexical competition provides a means of dealing with lexical embedding. It is also supported by a growing body of experimental evidence. We present results which indicate that competition operates both between word candidates that begin at the same point in the input and candidates that begin at different points (McQueen, Norris, & Cutler, 1994; Norris, McQueen, & Cutler, in press). We conclude that lexical competition is an essential component in models of continuous speech recognition.
  • Norris, D., McQueen, J. M., & Cutler, A. (1995). Competition and segmentation in spoken word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1209-1228.

    Abstract

    Spoken utterances contain few reliable cues to word boundaries, but listeners nonetheless experience little difficulty identifying words in continuous speech. The authors present data and simulations that suggest that this ability is best accounted for by a model of spoken-word recognition combining competition between alternative lexical candidates and sensitivity to prosodic structure. In a word-spotting experiment, stress pattern effects emerged most clearly when there were many competing lexical candidates for part of the input. Thus, competition between simultaneously active word candidates can modulate the size of prosodic effects, which suggests that spoken-word recognition must be sensitive both to prosodic structure and to the effects of competition. A version of the Shortlist model (D. G. Norris, 1994b) incorporating the Metrical Segmentation Strategy (A. Cutler & D. Norris, 1988) accurately simulates the results using a lexicon of more than 25,000 words.
  • Otake, T., Davis, S. M., & Cutler, A. (1995). Listeners’ representations of within-word structure: A cross-linguistic and cross-dialectal investigation. In J. Pardo (Ed.), Proceedings of EUROSPEECH 95: Vol. 3 (pp. 1703-1706). Madrid: European Speech Communication Association.

    Abstract

    Japanese, British English and American English listeners were presented with spoken words in their native language, and asked to mark on a written transcript of each word the first natural division point in the word. The results showed clear and strong patterns of consensus, indicating that listeners have available to them conscious representations of within-word structure. Orthography did not play a strongly deciding role in the results. The patterns of response were at variance with results from on-line studies of speech segmentation, suggesting that the present task taps not those representations used in on-line listening, but levels of representation which may involve much richer knowledge of word-internal structure.
  • Cutler, A., & Fay, D. A. (Eds.). (1978). [Annotated re-issue of R. Meringer and C. Mayer: Versprechen und Verlesen, 1895]. Amsterdam: John Benjamins.
  • Cutler, A., & Fay, D. (1978). Introduction. In A. Cutler, & D. Fay (Eds.), [Annotated re-issue of R. Meringer and C. Mayer: Versprechen und Verlesen, 1895] (pp. ix-xl). Amsterdam: John Benjamins.
  • Cutler, A., & Cooper, W. E. (1978). Phoneme-monitoring in the context of different phonetic sequences. Journal of Phonetics, 6, 221-225.

    Abstract

    The order of some conjoined words is rigidly fixed (e.g. dribs and drabs/*drabs and dribs). Both phonetic and semantic factors can play a role in determining the fixed order. An experiment was conducted to test whether listeners’ reaction times for monitoring a predetermined phoneme are influenced by phonetic constraints on ordering. Two such constraints were investigated: monosyllable-bisyllable and high-low vowel sequences. In English, conjoined words occur in such sequences with much greater frequency than their converses, other factors being equal. Reaction times were significantly shorter for phoneme monitoring in monosyllable-bisyllable sequences than in bisyllable-monosyllable sequences. However, reaction times were not significantly different for high-low vs. low-high vowel sequences.