Anne Cutler

Publications

Displaying 1 - 35 of 35
  • Cooper, N., & Cutler, A. (2004). Perception of non-native phonemes in noise. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 469-472). Seoul: Sunjijn Printing Co.

    Abstract

    We report an investigation of the perception of American English phonemes by Dutch listeners proficient in English. Listeners identified either the consonant or the vowel in most possible English CV and VC syllables. The syllables were embedded in multispeaker babble at three signal-to-noise ratios (16 dB, 8 dB, and 0 dB). Effects of signal-to-noise ratio on vowel and consonant identification are discussed as a function of syllable position and of relationship to the native phoneme inventory. Comparison of the results with previously reported data from native listeners reveals that noise affected the responding of native and non-native listeners similarly.
  • Cutler, A. (2004). Segmentation of spoken language by normal adult listeners. In R. Kent (Ed.), MIT encyclopedia of communication sciences and disorders (pp. 392-395). Cambridge, MA: MIT Press.
  • Cutler, A., Mister, E., Norris, D., & Sebastián-Gallés, N. (2004). La perception de la parole en espagnol: Un cas particulier? In L. Ferrand, & J. Grainger (Eds.), Psycholinguistique cognitive: Essais en l'honneur de Juan Segui (pp. 57-74). Brussels: De Boeck.
  • Cutler, A., Weber, A., Smits, R., & Cooper, N. (2004). Patterns of English phoneme confusions by native and non-native listeners. Journal of the Acoustical Society of America, 116(6), 3668-3678. doi:10.1121/1.1810292.

    Abstract

    Native American English and non-native(Dutch)listeners identified either the consonant or the vowel in all possible American English CV and VC syllables. The syllables were embedded in multispeaker babble at three signal-to-noise ratios(0, 8, and 16 dB). The phoneme identification performance of the non-native listeners was less accurate than that of the native listeners. All listeners were adversely affected by noise. With these isolated syllables, initial segments were harder to identify than final segments. Crucially, the effects of language background and noise did not interact; the performance asymmetry between the native and non-native groups was not significantly different across signal-to-noise ratios. It is concluded that the frequently reported disproportionate difficulty of non-native listening under disadvantageous conditions is not due to a disproportionate increase in phoneme misidentifications.
  • Cutler, A., Norris, D., & Sebastián-Gallés, N. (2004). Phonemic repertoire and similarity within the vocabulary. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 65-68). Seoul: Sunjijn Printing Co.

    Abstract

    Language-specific differences in the size and distribution of the phonemic repertoire can have implications for the task facing listeners in recognising spoken words. A language with more phonemes will allow shorter words and reduced embedding of short words within longer ones, decreasing the potential for spurious lexical competitors to be activated by speech signals. We demonstrate that this is the case via comparative analyses of the vocabularies of English and Spanish. A language which uses suprasegmental as well as segmental contrasts, however, can substantially reduce the extent of spurious embedding.
  • Cutler, A. (2004). On spoken-word recognition in a second language. Newsletter, American Association of Teachers of Slavic and East European Languages, 47, 15-15.
  • Cutler, A., & Henton, C. G. (2004). There's many a slip 'twixt the cup and the lip. In H. Quené, & V. Van Heuven (Eds.), On speech and Language: Studies for Sieb G. Nooteboom (pp. 37-45). Utrecht: Netherlands Graduate School of Linguistics.

    Abstract

    The retiring academic may look back upon, inter alia, years of conference attendance. Speech error researchers are uniquely fortunate because they can collect data in any situation involving communication; accordingly, the retiring speech error researcher will have collected data at those conferences. We here address the issue of whether error data collected in situations involving conviviality (such as at conferences) is representative of error data in general. Our approach involved a comparison, across three levels of linguistic processing, between a specially constructed Conviviality Sample and the largest existing source of speech error data, the newly available Fromkin Speech Error Database. The results indicate that there are grounds for regarding the data in the Conviviality Sample as a better than average reflection of the true population of all errors committed. These findings encourage us to recommend further data collection in collaboration with like-minded colleagues.
  • Cutler, A. (2004). Twee regels voor academische vorming. In H. Procee (Ed.), Bij die wereld wil ik horen! Zesendertig columns en drie essays over de vorming tot academicus. (pp. 42-45). Amsterdam: Boom.
  • Indefrey, P., & Cutler, A. (2004). Prelexical and lexical processing in listening. In M. Gazzaniga (Ed.), The cognitive neurosciences III. (pp. 759-774). Cambridge, MA: MIT Press.

    Abstract

    This paper presents a meta-analysis of hemodynamic studies on passive auditory language processing. We assess the overlap of hemodynamic activation areas and activation maxima reported in experiments involving the presentation of sentences, words, pseudowords, or sublexical or non-linguistic auditory stimuli. Areas that have been reliably replicated are identified. The results of the meta-analysis are compared to electrophysiological, magnetencephalic (MEG), and clinical findings. It is concluded that auditory language input is processed in a left posterior frontal and bilateral temporal cortical network. Within this network, no processing leve l is related to a single cortical area. The temporal lobes seem to differ with respect to their involvement in post-lexical processing, in that the left temporal lobe has greater involvement than the right, and also in the degree of anatomical specialization for phonological, lexical, and sentence -level processing, with greater overlap on the right contrasting with a higher degree of differentiation on the left.
  • Weber, A., & Cutler, A. (2004). Lexical competition in non-native spoken-word recognition. Journal of Memory and Language, 50(1), 1-25. doi:10.1016/S0749-596X(03)00105-0.

    Abstract

    Four eye-tracking experiments examined lexical competition in non-native spoken-word recognition. Dutch listeners hearing English fixated longer on distractor pictures with names containing vowels that Dutch listeners are likely to confuse with vowels in a target picture name (pencil, given target panda) than on less confusable distractors (beetle, given target bottle). English listeners showed no such viewing time difference. The confusability was asymmetric: given pencil as target, panda did not distract more than distinct competitors. Distractors with Dutch names phonologically related to English target names (deksel, ‘lid,’ given target desk) also received longer fixations than distractors with phonologically unrelated names. Again, English listeners showed no differential effect. With the materials translated into Dutch, Dutch listeners showed no activation of the English words (desk, given target deksel). The results motivate two conclusions: native phonemic categories capture second-language input even when stored representations maintain a second-language distinction; and lexical competition is greater for non-native than for native listeners.
  • Butterfield, S., & Cutler, A. (1990). Intonational cues to word boundaries in clear speech? In Proceedings of the Institute of Acoustics: Vol 12, part 10 (pp. 87-94). St. Albans, Herts.: Institute of Acoustics.
  • Cutler, A., & Butterfield, S. (1990). Durational cues to word boundaries in clear speech. Speech Communication, 9, 485-495.

    Abstract

    One of a listener’s major tasks in understanding continuous speech in segmenting the speech signal into separate words. When listening conditions are difficult, speakers can help listeners by deliberately clear speech. We found that speakers do indeed attempt to makr word boundaries; moreover, they differentiate between word boundaries in a way which suggest they are sensitive to listener needs. Application of heuristic segmentation strategies makes word boundaries before strong syllables easiest for listeners to perceive; but under difficult listening conditions speakers pay more attention to marking word boundaries before weak syllables, i.e. they mark those boundaries which are otherwise particularly hard to perceive.
  • Cutler, A., McQueen, J. M., & Robinson, K. (1990). Elizabeth and John: Sound patterns of men’s and women’s names. Journal of Linguistics, 26, 471-482. doi:10.1017/S0022226700014754.
  • Cutler, A. (1990). Exploiting prosodic probabilities in speech segmentation. In G. Altmann (Ed.), Cognitive models of speech processing: Psycholinguistic and computational perspectives (pp. 105-121). Cambridge, MA: MIT Press.
  • Cutler, A. (1990). From performance to phonology: Comments on Beckman and Edwards's paper. In J. Kingston, & M. Beckman (Eds.), Papers in laboratory phonology I: Between the grammar and physics of speech (pp. 208-214). Cambridge: Cambridge University Press.
  • Cutler, A., & Scott, D. R. (1990). Speaker sex and perceived apportionment of talk. Applied Psycholinguistics, 11, 253-272. doi:10.1017/S0142716400008882.

    Abstract

    It is a widely held belief that women talk more than men; but experimental evidence has suggested that this belief is mistaken. The present study investigated whether listener bias contributes to this mistake. Dialogues were recorded in mixed-sex and single-sex versions, and male and female listeners judged the proportions of talk contributed to the dialogues by each participant. Female contributions to mixed-sex dialogues were rated as greater than male contributions by both male and female listeners. Female contributions were more likely to be overestimated when they were speaking a dialogue part perceived as probably female than when they were speaking a dialogue part perceived as probably male. It is suggested that the misestimates are due to a complex of factors that may involve both perceptual effects such as misjudgment of rates of speech and sociological effects such as attitudes to social roles and perception of power relations.
  • Cutler, A. (1990). Syllabic lengthening as a word boundary cue. In R. Seidl (Ed.), Proceedings of the 3rd Australian International Conference on Speech Science and Technology (pp. 324-328). Canberra: Australian Speech Science and Technology Association.

    Abstract

    Bisyllabic sequences which could be interpreted as one word or two were produced in sentence contexts by a trained speaker, and syllabic durations measured. Listeners judged whether the bisyllables, excised from context, were one word or two. The proportion of two-word choices correlated positively with measured duration, but only for bisyllables stressed on the second syllable. The results may suggest a limit for listener sensitivity to syllabic lengthening as a word boundary cue.
  • Cutler, A., Norris, D., & Van Ooijen, B. (1990). Vowels as phoneme detection targets. In Proceedings of the First International Conference on Spoken Language Processing (pp. 581-584).

    Abstract

    Phoneme detection is a psycholinguistic task in which listeners' response time to detect the presence of a pre-specified phoneme target is measured. Typically, detection tasks have used consonant targets. This paper reports two experiments in which subjects responded to vowels as phoneme detection targets. In the first experiment, targets occurred in real words, in the second in nonsense words. Response times were long by comparison with consonantal targets. Targets in initial syllables were responded to much more slowly than targets in second syllables. Strong vowels were responded to faster than reduced vowels in real words but not in nonwords. These results suggest that the process of phoneme detection produces different results for vowels and for consonants. We discuss possible explanations for this difference, in particular the possibility of language-specificity.
  • Mehler, J., & Cutler, A. (1990). Psycholinguistic implications of phonological diversity among languages. In M. Piattelli-Palmerini (Ed.), Cognitive science in Europe: Issues and trends (pp. 119-134). Rome: Golem.
  • Cutler, A. (1989). Auditory lexical access: Where do we start? In W. Marslen-Wilson (Ed.), Lexical representation and process (pp. 342-356). Cambridge, MA: MIT Press.

    Abstract

    The lexicon, considered as a component of the process of recognizing speech, is a device that accepts a sound image as input and outputs meaning. Lexical access is the process of formulating an appropriate input and mapping it onto an entry in the lexicon's store of sound images matched with their meanings. This chapter addresses the problems of auditory lexical access from continuous speech. The central argument to be proposed is that utterance prosody plays a crucial role in the access process. Continuous listening faces problems that are not present in visual recognition (reading) or in noncontinuous recognition (understanding isolated words). Aspects of utterance prosody offer a solution to these particular problems.
  • Cutler, A., & Butterfield, S. (1989). Natural speech cues to word segmentation under difficult listening conditions. In J. Tubach, & J. Mariani (Eds.), Proceedings of Eurospeech 89: European Conference on Speech Communication and Technology: Vol. 2 (pp. 372-375). Edinburgh: CEP Consultants.

    Abstract

    One of a listener's major tasks in understanding continuous speech is segmenting the speech signal into separate words. When listening conditions are difficult, speakers can help listeners by deliberately speaking more clearly. In three experiments, we examined how word boundaries are produced in deliberately clear speech. We found that speakers do indeed attempt to mark word boundaries; moreover, they differentiate between word boundaries in a way which suggests they are sensitive to listener needs. Application of heuristic segmentation strategies makes word boundaries before strong syllables easiest for listeners to perceive; but under difficult listening conditions speakers pay more attention to marking word boundaries before weak syllables, i.e. they mark those boundaries which are otherwise particularly hard to perceive.
  • Cutler, A., Howard, D., & Patterson, K. E. (1989). Misplaced stress on prosody: A reply to Black and Byng. Cognitive Neuropsychology, 6, 67-83.

    Abstract

    The recent claim by Black and Byng (1986) that lexical access in reading is subject to prosodic constraints is examined and found to be unsupported. The evidence from impaired reading which Black and Byng report is based on poorly controlled stimulus materials and is inadequately analysed and reported. An alternative explanation of their findings is proposed, and new data are reported for which this alternative explanation can account but their model cannot. Finally, their proposal is shown to be theoretically unmotivated and in conflict with evidence from normal reading.
  • Cutler, A. (1989). Straw modules [Commentary/Massaro: Speech perception]. Behavioral and Brain Sciences, 12, 760-762.
  • Cutler, A. (1989). The new Victorians. New Scientist, (1663), 66.
  • Patterson, R. D., & Cutler, A. (1989). Auditory preprocessing and recognition of speech. In A. Baddeley, & N. Bernsen (Eds.), Research directions in cognitive science: A european perspective: Vol. 1. Cognitive psychology (pp. 23-60). London: Erlbaum.
  • Smith, M. R., Cutler, A., Butterfield, S., & Nimmo-Smith, I. (1989). The perception of rhythm and word boundaries in noise-masked speech. Journal of Speech and Hearing Research, 32, 912-920.

    Abstract

    The present experiment tested the suggestion that human listeners may exploit durational information in speech to parse continuous utterances into words. Listeners were presented with six-syllable unpredictable utterances under noise-masking, and were required to judge between alternative word strings as to which best matched the rhythm of the masked utterances. For each utterance there were four alternative strings: (a) an exact rhythmic and word boundary match, (b) a rhythmic mismatch, and (c) two utterances with the same rhythm as the masked utterance, but different word boundary locations. Listeners were clearly able to perceive the rhythm of the masked utterances: The rhythmic mismatch was chosen significantly less often than any other alternative. Within the three rhythmically matched alternatives, the exact match was chosen significantly more often than either word boundary mismatch. Thus, listeners both perceived speech rhythm and used durational cues effectively to locate the position of word boundaries.
  • Butterfield, S., & Cutler, A. (1988). Segmentation errors by human listeners: Evidence for a prosodic segmentation strategy. In W. Ainsworth, & J. Holmes (Eds.), Proceedings of SPEECH ’88: Seventh Symposium of the Federation of Acoustic Societies of Europe: Vol. 3 (pp. 827-833). Edinburgh: Institute of Acoustics.
  • Cutler, A., Mehler, J., Norris, D., & Segui, J. (1988). Limits on bilingualism [Letters to Nature]. Nature, 340, 229-230. doi:10.1038/340229a0.

    Abstract

    SPEECH, in any language, is continuous; speakers provide few reliable cues to the boundaries of words, phrases, or other meaningful units. To understand speech, listeners must divide the continuous speech stream into portions that correspond to such units. This segmentation process is so basic to human language comprehension that psycholinguists long assumed that all speakers would do it in the same way. In previous research1,2, however, we reported that segmentation routines can be language-specific: speakers of French process spoken words syllable by syllable, but speakers of English do not. French has relatively clear syllable boundaries and syllable-based timing patterns, whereas English has relatively unclear syllable boundaries and stress-based timing; thus syllabic segmentation would work more efficiently in the comprehension of French than in the comprehension of English. Our present study suggests that at this level of language processing, there are limits to bilingualism: a bilingual speaker has one and only one basic language.
  • Cutler, A., & Norris, D. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance, 14, 113-121. doi:10.1037/0096-1523.14.1.113.

    Abstract

    A model of speech segmentation in a stress language is proposed, according to which the occurrence of a strong syllable triggers segmentation of the speech signal, whereas occurrence of a weak syllable does not trigger segmentation. We report experiments in which listeners detected words embedded in nonsense bisyllables more slowly when the bisyllable had two strong syllables than when it had a strong and a weak syllable; mint was detected more slowly in mintayve than in mintesh. According to our proposed model, this result is an effect of segmentation: When the second syllable is strong, it is segmented from the first syllable, and successful detection of the embedded word therefore requires assembly of speech material across a segmentation position. Speech recognition models involving phonemic or syllabic recoding, or based on strictly left-to-right processes, do not predict this result. It is argued that segmentation at strong syllables in continuous speech recognition serves the purpose of detecting the most efficient locations at which to initiate lexical access. (C) 1988 by the American Psychological Association
  • Cutler, A. (1988). The perfect speech error. In L. Hyman, & C. Li (Eds.), Language, speech and mind: Studies in honor of Victoria A. Fromkin (pp. 209-223). London: Croom Helm.
  • Hawkins, J. A., & Cutler, A. (1988). Psycholinguistic factors in morphological asymmetry. In J. A. Hawkins (Ed.), Explaining language universals (pp. 280-317). Oxford: Blackwell.
  • Henderson, L., Coltheart, M., Cutler, A., & Vincent, N. (1988). Preface. Linguistics, 26(4), 519-520. doi:10.1515/ling.1988.26.4.519.
  • Mehta, G., & Cutler, A. (1988). Detection of target phonemes in spontaneous and read speech. Language and Speech, 31, 135-156.

    Abstract

    Although spontaneous speech occurs more frequently in most listeners’ experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ considerably, however, which suggests that laboratory results may not generalize to the recognition of spontaneous and read speech materials, and their response time to detect word-initial target phonemes was measured. Response were, overall, equally fast in each speech mode. However analysis of effects previously reported in phoneme detection studies revealed significant differences between speech modes. In read speech but not in spontaneous speech, later targets were detected more rapidly than earlier targets, and targets preceded by long words were detected more rapidly than targets preceded by short words. In contrast, in spontaneous speech but not in read speech, targets were detected more rapidly in accented than unaccented words and in strong than in weak syllables. An explanation for this pattern is offered in terms of characteristic prosodic differences between spontaneous and read speech. The results support claim from previous work that listeners pay great attention to prosodic information in the process of recognizing speech.
  • Norris, D., & Cutler, A. (1988). Speech recognition in French and English. MRC News, 39, 30-31.
  • Norris, D., & Cutler, A. (1988). The relative accessibility of phonemes and syllables. Perception and Psychophysics, 43, 541-550. Retrieved from http://www.psychonomic.org/search/view.cgi?id=8530.

    Abstract

    Previous research comparing detection times for syllables and for phonemes has consistently found that syllables are responded to faster than phonemes. This finding poses theoretical problems for strictly hierarchical models of speech recognition, in which smaller units should be able to be identified faster than larger units. However, inspection of the characteristics of previous experiments’stimuli reveals that subjects have been able to respond to syllables on the basis of only a partial analysis of the stimulus. In the present experiment, five groups of subjects listened to identical stimulus material. Phoneme and syllable monitoring under standard conditions was compared with monitoring under conditions in which near matches of target and stimulus occurred on no-response trials. In the latter case, when subjects were forced to analyze each stimulus fully, phonemes were detected faster than syllables.

Share this page