Anne Cutler

Publications

Displaying 1 - 34 of 34
  • Braun, B., Tagliapietra, L., & Cutler, A. (2008). Contrastive utterances make alternatives salient: Cross-modal priming evidence. In Proceedings of Interspeech 2008 (pp. 69-69).

    Abstract

    Sentences with contrastive intonation are assumed to presuppose contextual alternatives to the accented elements. Two cross-modal priming experiments tested in Dutch whether such contextual alternatives are automatically available to listeners. Contrastive associates – but not non- contrastive associates - were facilitated only when primes were produced in sentences with contrastive intonation, indicating that contrastive intonation makes unmentioned contextual alternatives immediately available. Possibly, contrastive contours trigger a “presupposition resolution mechanism” by which these alternatives become salient.
  • Braun, B., Lemhöfer, K., & Cutler, A. (2008). English word stress as produced by English and Dutch speakers: The role of segmental and suprasegmental differences. In Proceedings of Interspeech 2008 (pp. 1953-1953).

    Abstract

    It has been claimed that Dutch listeners use suprasegmental cues (duration, spectral tilt) more than English listeners in distinguishing English word stress. We tested whether this asymmetry also holds in production, comparing the realization of English word stress by native English speakers and Dutch speakers. Results confirmed that English speakers centralize unstressed vowels more, while Dutch speakers of English make more use of suprasegmental differences.
  • Broersma, M., & Cutler, A. (2008). Phantom word activation in L2. System, 36(1), 22-34. doi:10.1016/j.system.2007.11.003.

    Abstract

    L2 listening can involve the phantom activation of words which are not actually in the input. All spoken-word recognition involves multiple concurrent activation of word candidates, with selection of the correct words achieved by a process of competition between them. L2 listening involves more such activation than L1 listening, and we report two studies illustrating this. First, in a lexical decision study, L2 listeners accepted (but L1 listeners did not accept) spoken non-words such as groof or flide as real English words. Second, a priming study demonstrated that the same spoken non-words made recognition of the real words groove, flight easier for L2 (but not L1) listeners, suggesting that, for the L2 listeners only, these real words had been activated by the spoken non-word input. We propose that further understanding of the activation and competition process in L2 lexical processing could lead to new understanding of L2 listening difficulty.
  • Cutler, A., Garcia Lecumberri, M. L., & Cooke, M. (2008). Consonant identification in noise by native and non-native listeners: Effects of local context. Journal of the Acoustical Society of America, 124(2), 1264-1268. doi:10.1121/1.2946707.

    Abstract

    Speech recognition in noise is harder in second (L2) than first languages (L1). This could be because noise disrupts speech processing more in L2 than L1, or because L1 listeners recover better though disruption is equivalent. Two similar prior studies produced discrepant results: Equivalent noise effects for L1 and L2 (Dutch) listeners, versus larger effects for L2 (Spanish) than L1. To explain this, the latter experiment was presented to listeners from the former population. Larger noise effects on consonant identification emerged for L2 (Dutch) than L1 listeners, suggesting that task factors rather than L2 population differences underlie the results discrepancy.
  • Cutler, A., McQueen, J. M., Butterfield, S., & Norris, D. (2008). Prelexically-driven perceptual retuning of phoneme boundaries. In Proceedings of Interspeech 2008 (pp. 2056-2056).

    Abstract

    Listeners heard an ambiguous /f-s/ in nonword contexts where only one of /f/ or /s/ was legal (e.g., frul/*srul or *fnud/snud). In later categorisation of a phonetic continuum from /f/ to /s/, their category boundaries had shifted; hearing -rul led to expanded /f/ categories, -nud expanded /s/. Thus phonotactic sequence information alone induces perceptual retuning of phoneme category boundaries; lexical access is not required.
  • Cutler, A. (2008). The abstract representations in speech processing. Quarterly Journal of Experimental Psychology, 61(11), 1601-1619. doi:10.1080/13803390802218542.

    Abstract

    Speech processing by human listeners derives meaning from acoustic input via intermediate steps involving abstract representations of what has been heard. Recent results from several lines of research are here brought together to shed light on the nature and role of these representations. In spoken-word recognition, representations of phonological form and of conceptual content are dissociable. This follows from the independence of patterns of priming for a word's form and its meaning. The nature of the phonological-form representations is determined not only by acoustic-phonetic input but also by other sources of information, including metalinguistic knowledge. This follows from evidence that listeners can store two forms as different without showing any evidence of being able to detect the difference in question when they listen to speech. The lexical representations are in turn separate from prelexical representations, which are also abstract in nature. This follows from evidence that perceptual learning about speaker-specific phoneme realization, induced on the basis of a few words, generalizes across the whole lexicon to inform the recognition of all words containing the same phoneme. The efficiency of human speech processing has its basis in the rapid execution of operations over abstract representations.
  • Goudbeek, M., Cutler, A., & Smits, R. (2008). Supervised and unsupervised learning of multidimensionally varying nonnative speech categories. Speech Communication, 50(2), 109-125. doi:10.1016/j.specom.2007.07.003.

    Abstract

    The acquisition of novel phonetic categories is hypothesized to be affected by the distributional properties of the input, the relation of the new categories to the native phonology, and the availability of supervision (feedback). These factors were examined in four experiments in which listeners were presented with novel categories based on vowels of Dutch. Distribution was varied such that the categorization depended on the single dimension duration, the single dimension frequency, or both dimensions at once. Listeners were clearly sensitive to the distributional information, but unidimensional contrasts proved easier to learn than multidimensional. The native phonology was varied by comparing Spanish versus American English listeners. Spanish listeners found categorization by frequency easier than categorization by duration, but this was not true of American listeners, whose native vowel system makes more use of duration-based distinctions. Finally, feedback was either available or not; this comparison showed supervised learning to be significantly superior to unsupervised learning.
  • Kim, J., Davis, C., & Cutler, A. (2008). Perceptual tests of rhythmic similarity: II. Syllable rhythm. Language and Speech, 51(4), 343-359. doi:10.1177/0023830908099069.

    Abstract

    To segment continuous speech into its component words, listeners make use of language rhythm; because rhythm differs across languages, so do the segmentation procedures which listeners use. For each of stress-, syllable-and mora-based rhythmic structure, perceptual experiments have led to the discovery of corresponding segmentation procedures. In the case of mora-based rhythm, similar segmentation has been demonstrated in the otherwise unrelated languages Japanese and Telugu; segmentation based on syllable rhythm, however, has been previously demonstrated only for European languages from the Romance family. We here report two target detection experiments in which Korean listeners, presented with speech in Korean and in French, displayed patterns of segmentation like those previously observed in analogous experiments with French listeners. The Korean listeners' accuracy in detecting word-initial target fragments in either language was significantly higher when the fragments corresponded exactly to a syllable in the input than when the fragments were smaller or larger than a syllable. We conclude that Korean and French listeners can call on similar procedures for segmenting speech, and we further propose that perceptual tests of speech segmentation provide a valuable accompaniment to acoustic analyses for establishing languages' rhythmic class membership.
  • Kooijman, V., Johnson, E. K., & Cutler, A. (2008). Reflections on reflections of infant word recognition. In A. D. Friederici, & G. Thierry (Eds.), Early language development: Bridging brain and behaviour (pp. 91-114). Amsterdam: Benjamins.
  • Clifton, Jr., C., Cutler, A., McQueen, J. M., & Van Ooijen, B. (1999). The processing of inflected forms. [Commentary on H. Clahsen: Lexical entries and rules of language.]. Behavioral and Brain Sciences, 22, 1018-1019.

    Abstract

    Clashen proposes two distinct processing routes, for regularly and irregularly inflected forms, respectively, and thus is apparently making a psychological claim. We argue his position, which embodies a strictly linguistic perspective, does not constitute a psychological processing model.
  • Cutler, A., & Clifton, Jr., C. (1999). Comprehending spoken language: A blueprint of the listener. In C. M. Brown, & P. Hagoort (Eds.), The neurocognition of language (pp. 123-166). Oxford University Press.
  • Cutler, A. (1999). Foreword. In Slips of the Ear: Errors in the perception of Casual Conversation (pp. xiii-xv). New York City, NY, USA: Academic Press.
  • Cutler, A., & Otake, T. (1999). Pitch accent in spoken-word recognition in Japanese. Journal of the Acoustical Society of America, 105, 1877-1888.

    Abstract

    Three experiments addressed the question of whether pitch-accent information may be exploited in the process of recognizing spoken words in Tokyo Japanese. In a two-choice classification task, listeners judged from which of two words, differing in accentual structure, isolated syllables had been extracted ~e.g., ka from baka HL or gaka LH!; most judgments were correct, and listeners’ decisions were correlated with the fundamental frequency characteristics of the syllables. In a gating experiment, listeners heard initial fragments of words and guessed what the words were; their guesses overwhelmingly had the same initial accent structure as the gated word even when only the beginning CV of the stimulus ~e.g., na- from nagasa HLL or nagashi LHH! was presented. In addition, listeners were more confident in guesses with the same initial accent structure as the stimulus than in guesses with different accent. In a lexical decision experiment, responses to spoken words ~e.g., ame HL! were speeded by previous presentation of the same word ~e.g., ame HL! but not by previous presentation of a word differing only in accent ~e.g., ame LH!. Together these findings provide strong evidence that accentual information constrains the activation and selection of candidates for spoken-word recognition.
  • Cutler, A. (1999). Prosodische Struktur und Worterkennung bei gesprochener Sprache. In A. D. Friedrici (Ed.), Enzyklopädie der Psychologie: Sprachrezeption (pp. 49-83). Göttingen: Hogrefe.
  • Cutler, A. (1999). Prosody and intonation, processing issues. In R. A. Wilson, & F. C. Keil (Eds.), MIT encyclopedia of the cognitive sciences (pp. 682-683). Cambridge, MA: MIT Press.
  • Cutler, A., & Norris, D. (1999). Sharpening Ockham’s razor (Commentary on W.J.M. Levelt, A. Roelofs & A.S. Meyer: A theory of lexical access in speech production). Behavioral and Brain Sciences, 22, 40-41.

    Abstract

    Language production and comprehension are intimately interrelated; and models of production and comprehension should, we argue, be constrained by common architectural guidelines. Levelt et al.'s target article adopts as guiding principle Ockham's razor: the best model of production is the simplest one. We recommend adoption of the same principle in comprehension, with consequent simplification of some well-known types of models.
  • Cutler, A. (1999). Spoken-word recognition. In R. A. Wilson, & F. C. Keil (Eds.), MIT encyclopedia of the cognitive sciences (pp. 796-798). Cambridge, MA: MIT Press.
  • Cutler, A., Van Ooijen, B., & Norris, D. (1999). Vowels, consonants, and lexical activation. In J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, & A. Bailey (Eds.), Proceedings of the Fourteenth International Congress of Phonetic Sciences: Vol. 3 (pp. 2053-2056). Berkeley: University of California.

    Abstract

    Two lexical decision studies examined the effects of single-phoneme mismatches on lexical activation in spoken-word recognition. One study was carried out in English, and involved spoken primes and visually presented lexical decision targets. The other study was carried out in Dutch, and primes and targets were both presented auditorily. Facilitation was found only for spoken targets preceded immediately by spoken primes; no facilitation occurred when targets were presented visually, or when intervening input occurred between prime and target. The effects of vowel mismatches and consonant mismatches were equivalent.
  • McQueen, J. M., Norris, D., & Cutler, A. (1999). Lexical influence in phonetic decision-making: Evidence from subcategorical mismatches. Journal of Experimental Psychology: Human Perception and Performance, 25, 1363-1389. doi:10.1037/0096-1523.25.5.1363.

    Abstract

    In 5 experiments, listeners heard words and nonwords, some cross-spliced so that they contained acoustic-phonetic mismatches. Performance was worse on mismatching than on matching items. Words cross-spliced with words and words cross-spliced with nonwords produced parallel results. However, in lexical decision and 1 of 3 phonetic decision experiments, performance on nonwords cross-spliced with words was poorer than on nonwords cross-spliced with nonwords. A gating study confirmed that there were misleading coarticulatory cues in the cross-spliced items; a sixth experiment showed that the earlier results were not due to interitem differences in the strength of these cues. Three models of phonetic decision making (the Race model, the TRACE model, and a postlexical model) did not explain the data. A new bottom-up model is outlined that accounts for the findings in terms of lexical involvement at a dedicated decision-making stage.
  • Otake, T., & Cutler, A. (1999). Perception of suprasegmental structure in a nonnative dialect. Journal of Phonetics, 27, 229-253. doi:10.1006/jpho.1999.0095.

    Abstract

    Two experiments examined the processing of Tokyo Japanese pitchaccent distinctions by native speakers of Japanese from two accentlessvariety areas. In both experiments, listeners were presented with Tokyo Japanese speech materials used in an earlier study with Tokyo Japanese listeners, who clearly exploited the pitch-accent information in spokenword recognition. In the "rst experiment, listeners judged from which of two words, di!ering in accentual structure, isolated syllables had been extracted. Both new groups were, overall, as successful at this task as Tokyo Japanese speakers had been, but their response patterns differed from those of the Tokyo Japanese, for instance in that a bias towards H judgments in the Tokyo Japanese responses was weakened in the present groups' responses. In a second experiment, listeners heard word fragments and guessed what the words were; in this task, the speakers from accentless areas again performed significantly above chance, but their responses showed less sensitivity to the information in the input, and greater bias towards vocabulary distribution frequencies, than had been observed with the Tokyo Japanese listeners. The results suggest that experience with a local accentless dialect affects the processing of accent for word recognition in Tokyo Japanese, even for listeners with extensive exposure to Tokyo Japanese.
  • Shattuck-Hufnagel, S., & Cutler, A. (1999). The prosody of speech error corrections revisited. In J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, & A. Bailey (Eds.), Proceedings of the Fourteenth International Congress of Phonetic Sciences: Vol. 2 (pp. 1483-1486). Berkely: University of California.

    Abstract

    A corpus of digitized speech errors is used to compare the prosody of correction patterns for word-level vs. sound-level errors. Results for both peak F0 and perceived prosodic markedness confirm that speakers are more likely to mark corrections of word-level errors than corrections of sound-level errors, and that errors ambiguous between word-level and soundlevel (such as boat for moat) show correction patterns like those for sound level errors. This finding increases the plausibility of the claim that word-sound-ambiguous errors arise at the same level of processing as sound errors that do not form words.
  • Van Donselaar, W., Kuijpers, C. T., & Cutler, A. (1999). Facilitatory effects of vowel epenthesis on word processing in Dutch. Journal of Memory and Language, 41, 59-77. doi:10.1006/jmla.1999.2635.

    Abstract

    We report a series of experiments examining the effects on word processing of insertion of an optional epenthetic vowel in word-final consonant clusters in Dutch. Such epenthesis turns film, for instance, into film. In a word-reversal task listeners treated words with and without epenthesis alike, as monosyllables, suggesting that the variant forms both activate the same canonical representation, that of a monosyllabic word without epenthesis. In both lexical decision and word spotting, response times to recognize words were significantly faster when epenthesis was present than when the word was presented in its canonical form without epenthesis. It is argued that addition of the epenthetic vowel makes the liquid consonants constituting the first member of a cluster more perceptible; a final phoneme-detection experiment confirmed that this was the case. These findings show that a transformed variant of a word, although it contacts the lexicon via the representation of the canonical form, can be more easily perceptible than that canonical form.
  • Butterfield, S., & Cutler, A. (1988). Segmentation errors by human listeners: Evidence for a prosodic segmentation strategy. In W. Ainsworth, & J. Holmes (Eds.), Proceedings of SPEECH ’88: Seventh Symposium of the Federation of Acoustic Societies of Europe: Vol. 3 (pp. 827-833). Edinburgh: Institute of Acoustics.
  • Cutler, A., Mehler, J., Norris, D., & Segui, J. (1988). Limits on bilingualism [Letters to Nature]. Nature, 340, 229-230. doi:10.1038/340229a0.

    Abstract

    SPEECH, in any language, is continuous; speakers provide few reliable cues to the boundaries of words, phrases, or other meaningful units. To understand speech, listeners must divide the continuous speech stream into portions that correspond to such units. This segmentation process is so basic to human language comprehension that psycholinguists long assumed that all speakers would do it in the same way. In previous research1,2, however, we reported that segmentation routines can be language-specific: speakers of French process spoken words syllable by syllable, but speakers of English do not. French has relatively clear syllable boundaries and syllable-based timing patterns, whereas English has relatively unclear syllable boundaries and stress-based timing; thus syllabic segmentation would work more efficiently in the comprehension of French than in the comprehension of English. Our present study suggests that at this level of language processing, there are limits to bilingualism: a bilingual speaker has one and only one basic language.
  • Cutler, A., & Norris, D. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance, 14, 113-121. doi:10.1037/0096-1523.14.1.113.

    Abstract

    A model of speech segmentation in a stress language is proposed, according to which the occurrence of a strong syllable triggers segmentation of the speech signal, whereas occurrence of a weak syllable does not trigger segmentation. We report experiments in which listeners detected words embedded in nonsense bisyllables more slowly when the bisyllable had two strong syllables than when it had a strong and a weak syllable; mint was detected more slowly in mintayve than in mintesh. According to our proposed model, this result is an effect of segmentation: When the second syllable is strong, it is segmented from the first syllable, and successful detection of the embedded word therefore requires assembly of speech material across a segmentation position. Speech recognition models involving phonemic or syllabic recoding, or based on strictly left-to-right processes, do not predict this result. It is argued that segmentation at strong syllables in continuous speech recognition serves the purpose of detecting the most efficient locations at which to initiate lexical access. (C) 1988 by the American Psychological Association
  • Cutler, A. (1988). The perfect speech error. In L. Hyman, & C. Li (Eds.), Language, speech and mind: Studies in honor of Victoria A. Fromkin (pp. 209-223). London: Croom Helm.
  • Hawkins, J. A., & Cutler, A. (1988). Psycholinguistic factors in morphological asymmetry. In J. A. Hawkins (Ed.), Explaining language universals (pp. 280-317). Oxford: Blackwell.
  • Henderson, L., Coltheart, M., Cutler, A., & Vincent, N. (1988). Preface. Linguistics, 26(4), 519-520. doi:10.1515/ling.1988.26.4.519.
  • Mehta, G., & Cutler, A. (1988). Detection of target phonemes in spontaneous and read speech. Language and Speech, 31, 135-156.

    Abstract

    Although spontaneous speech occurs more frequently in most listeners’ experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ considerably, however, which suggests that laboratory results may not generalize to the recognition of spontaneous and read speech materials, and their response time to detect word-initial target phonemes was measured. Response were, overall, equally fast in each speech mode. However analysis of effects previously reported in phoneme detection studies revealed significant differences between speech modes. In read speech but not in spontaneous speech, later targets were detected more rapidly than earlier targets, and targets preceded by long words were detected more rapidly than targets preceded by short words. In contrast, in spontaneous speech but not in read speech, targets were detected more rapidly in accented than unaccented words and in strong than in weak syllables. An explanation for this pattern is offered in terms of characteristic prosodic differences between spontaneous and read speech. The results support claim from previous work that listeners pay great attention to prosodic information in the process of recognizing speech.
  • Norris, D., & Cutler, A. (1988). Speech recognition in French and English. MRC News, 39, 30-31.
  • Norris, D., & Cutler, A. (1988). The relative accessibility of phonemes and syllables. Perception and Psychophysics, 43, 541-550. Retrieved from http://www.psychonomic.org/search/view.cgi?id=8530.

    Abstract

    Previous research comparing detection times for syllables and for phonemes has consistently found that syllables are responded to faster than phonemes. This finding poses theoretical problems for strictly hierarchical models of speech recognition, in which smaller units should be able to be identified faster than larger units. However, inspection of the characteristics of previous experiments’stimuli reveals that subjects have been able to respond to syllables on the basis of only a partial analysis of the stimulus. In the present experiment, five groups of subjects listened to identical stimulus material. Phoneme and syllable monitoring under standard conditions was compared with monitoring under conditions in which near matches of target and stimulus occurred on no-response trials. In the latter case, when subjects were forced to analyze each stimulus fully, phonemes were detected faster than syllables.
  • Cutler, A., & Fay, D. A. (Eds.). (1978). [Annotated re-issue of R. Meringer and C. Mayer: Versprechen und Verlesen, 1895]. Amsterdam: John Benjamins.
  • Cutler, A., & Fay, D. (1978). Introduction. In A. Cutler, & D. Fay (Eds.), [Annotated re-issue of R. Meringer and C. Mayer: Versprechen und Verlesen, 1895] (pp. ix-xl). Amsterdam: John Benjamins.
  • Cutler, A., & Cooper, W. E. (1978). Phoneme-monitoring in the context of different phonetic sequences. Journal of Phonetics, 6, 221-225.

    Abstract

    The order of some conjoined words is rigidly fixed (e.g. dribs and drabs/*drabs and dribs). Both phonetic and semantic factors can play a role in determining the fixed order. An experiment was conducted to test whether listerners’ reaction times for monitoring a predetermined phoneme are influenced by phonetic constraints on ordering. Two such constraints were investigated: monosyllable-bissyllable and high-low vowel sequences. In English, conjoined words occur in such sequences with much greater frequency than their converses, other factors being equal. Reaction times were significantly shorter for phoneme monitoring in monosyllable-bisyllable sequences than in bisyllable- monosyllable sequences. However, reaction times were not significantly different for high-low vs. low-high vowel sequences.

Share this page