Anne Cutler

Publications

Displaying 1 - 15 of 15
  • Cutler, A., & Norris, D. (2016). Bottoms up! How top-down pitfalls ensnare speech perception researchers too. Commentary on C. Firestone & B. Scholl: Cognition does not affect perception: Evaluating the evidence for 'top-down' effects. Behavioral and Brain Sciences, e236. doi:10.1017/S0140525X15002745.

    Abstract

    Not only can the pitfalls that Firestone & Scholl (F&S) identify be generalised across multiple studies within the field of visual perception, but also they have general application outside the field wherever perceptual and cognitive processing are compared. We call attention to the widespread susceptibility of research on the perception of speech to versions of the same pitfalls.
  • Norris, D., McQueen, J. M., & Cutler, A. (2016). Prediction, Bayesian inference and feedback in speech recognition. Language, Cognition and Neuroscience, 31(1), 4-18. doi:10.1080/23273798.2015.1081703.

    Abstract

    Speech perception involves prediction, but how is that prediction implemented? In cognitive models prediction has often been taken to imply that there is feedback of activation from lexical to pre-lexical processes as implemented in interactive-activation models (IAMs). We show that simple activation feedback does not actually improve speech recognition. However, other forms of feedback can be beneficial. In particular, feedback can enable the listener to adapt to changing input, and can potentially help the listener to recognise unusual input, or recognise speech in the presence of competing sounds. The common feature of these helpful forms of feedback is that they are all ways of optimising the performance of speech recognition using Bayesian inference. That is, listeners make predictions about speech because speech recognition is optimal in the sense captured in Bayesian models.
  • Cooper, N., Cutler, A., & Wales, R. (2002). Constraints of lexical stress on lexical access in English: Evidence from native and non-native listeners. Language and Speech, 45(3), 207-228.

    Abstract

    Four cross-modal priming experiments and two forced-choice identification experiments investigated the use of suprasegmental cues to stress in the recognition of spoken English words, by native (English-speaking) and non- native (Dutch) listeners. Previous results had indicated that suprasegmental information was exploited in lexical access by Dutch but not by English listeners. For both listener groups, recognition of visually presented target words was faster, in comparison to a control condition, after stress-matching spoken primes, either monosyllabic (mus- from MUsic /muSEum) or bisyl labic (admi- from ADmiral/admiRAtion). For native listeners, the effect of stress-mismatching bisyllabic primes was not different from that of control primes, but mismatching monosyllabic primes produced partial facilitation. For non-native listeners, both bisyllabic and monosyllabic stress-mismatching primes produced partial facilitation. Native English listeners thus can exploit suprasegmental information in spoken-word recognition, but information from two syllables is used more effectively than information from one syllable. Dutch listeners are less proficient at using suprasegmental information in English than in their native language, but, as in their native language, use mono- and bisyllabic information to an equal extent. In forced-choice identification, Dutch listeners outperformed native listeners at correctly assigning a monosyllabic fragment (e.g., mus-) to one of two words differing in stress.
  • Cutler, A. (2002). Native listeners. European Review, 10(1), 27-41. doi:10.1017/S1062798702000030.

    Abstract

    Becoming a native listener is the necessary precursor to becoming a native speaker. Babies in the first year of life undertake a remarkable amount of work; by the time they begin to speak, they have perceptually mastered the phonological repertoire and phoneme co-occurrence probabilities of the native language, and they can locate familiar word-forms in novel continuous-speech contexts. The skills acquired at this early stage form a necessary part of adult listening. However, the same native listening skills also underlie problems in listening to a late-acquired non-native language, accounting for why in such a case listening (an innate ability) is sometimes paradoxically more difficult than, for instance, reading (a learned ability).
  • Cutler, A., & Otake, T. (2002). Rhythmic categories in spoken-word recognition. Journal of Memory and Language, 46(2), 296-322. doi:10.1006/jmla.2001.2814.

    Abstract

    Rhythmic categories such as morae in Japanese or stress units in English play a role in the perception of spoken language. We examined this role in Japanese, since recent evidence suggests that morae may intervene as structural units in word recognition. First, we found that traditional puns more often substituted part of a mora than a whole mora. Second, when listeners reconstructed distorted words, e.g. panorama from panozema, responses were faster and more accurate when only a phoneme was distorted (panozama, panorema) than when a whole CV mora was distorted (panozema). Third, lexical decisions on the same nonwords were better predicted by duration and number of phonemes from nonword uniqueness point to word end than by number of morae. Our results indicate no role for morae in early spoken-word processing; we propose that rhythmic categories constrain not initial lexical activation but subsequent processes of speech segmentation and selection among word candidates.
  • Cutler, A., Demuth, K., & McQueen, J. M. (2002). Universality versus language-specificity in listening to running speech. Psychological Science, 13(3), 258-262. doi:10.1111/1467-9280.00447.

    Abstract

    Recognizing spoken language involves automatic activation of multiple candidate words. The process of selection between candidates is made more efficient by inhibition of embedded words (like egg in beg) that leave a portion of the input stranded (here, b). Results from European languages suggest that this inhibition occurs when consonants are stranded but not when syllables are stranded. The reason why leftover syllables do not lead to inhibition could be that in principle they might themselves be words; in European languages, a syllable can be a word. In Sesotho (a Bantu language), however, a single syllable cannot be a word. We report that in Sesotho, word recognition is inhibited by stranded consonants, but stranded monosyllables produce no more difficulty than stranded bisyllables (which could be Sesotho words). This finding suggests that the viability constraint which inhibits spurious embedded word candidates is not sensitive to language-specific word structure, but is universal.
  • Norris, D., McQueen, J. M., & Cutler, A. (2002). Bias effects in facilitatory phonological priming. Memory & Cognition, 30(3), 399-411.

    Abstract

    In four experiments, we examined the facilitation that occurs when spoken-word targets rhyme with preceding spoken primes. In Experiment 1, listeners’ lexical decisions were faster to words following rhyming words (e.g., ramp–LAMP) than to words following unrelated primes (e.g., pink–LAMP). No facilitation was observed for nonword targets. Targets that almost rhymed with their primes (foils; e.g., bulk–SULSH) were included in Experiment 2; facilitation for rhyming targets was severely attenuated. Experiments 3 and 4 were single-word shadowing variants of the earlier experiments. There was facilitation for both rhyming words and nonwords; the presence of foils had no significant influence on the priming effect. A major component of the facilitation in lexical decision appears to be strategic: Listeners are biased to say “yes” to targets that rhyme with their primes, unless foils discourage this strategy. The nonstrategic component of phonological facilitation may reflect speech perception processes that operate prior to lexical access.
  • Spinelli, E., Cutler, A., & McQueen, J. M. (2002). Resolution of liaison for lexical access in French. Revue Française de Linguistique Appliquée, 7, 83-96.

    Abstract

    Spoken word recognition involves automatic activation of lexical candidates compatible with the perceived input. In running speech, words abut one another without intervening gaps, and syllable boundaries can mismatch with word boundaries. For instance, liaison in ’petit agneau’ creates a syllable beginning with a consonant although ’agneau’ begins with a vowel. In two cross-modal priming experiments we investigate how French listeners recognise words in liaison environments. These results suggest that the resolution of liaison in part depends on acoustic cues which distinguish liaison from non-liaison consonants, and in part on the availability of lexical support for a liaison interpretation.
  • Boland, J. E., & Cutler, A. (1995). Interaction with autonomy: Defining multiple output models in psycholinguistic theory. Working Papers in Linguistic, 45, 1-10. Retrieved from http://hdl.handle.net/2066/15768.

    Abstract

    There are currently a number of psycholinguistic models in which processing at a particular level of representation is characterized by the generation of multiple outputs, with resolution involving the use of information from higher levels of processing. Surprisingly, models with this architecture have been characterized as autonomous within the domain of word recognition and as interactive within the domain of sentence processing. We suggest that the apparent internal confusion is not, as might be assumed, due to fundamental differences between lexical and syntactic processing. Rather, we believe that the labels in each domain were chosen in order to obtain maximal contrast between a new model and the model or models that were currently dominating the field.
  • Boland, J. E., & Cutler, A. (1995). Interaction with autonomy: Multiple Output models and the inadequacy of the Great Divide. Cognition, 58, 309-320. doi:10.1016/0010-0277(95)00684-2.

    Abstract

    There are currently a number of psycholinguistic models in which processing at a particular level of representation is characterized by the generation of multiple outputs, with resolution - but not generation - involving the use of information from higher levels of processing. Surprisingly, models with this architecture have been characterized as autonomous within the domain of word recognition but as interactive within the domain of sentence processing. We suggest that the apparent confusion is not, as might be assumed, due to fundamental differences between lexical and syntactic processing. Rather, we believe that the labels in each domain were chosen in order to obtain maximal contrast between a new model and the model or models that were currently dominating the field. The contradiction serves to highlight the inadequacy of a simple autonomy/interaction dichotomy for characterizing the architectures of current processing models.
  • Fear, B. D., Cutler, A., & Butterfield, S. (1995). The strong/weak syllable distinction in English. Journal of the Acoustical Society of America, 97, 1893-1904. doi:10.1121/1.412063.

    Abstract

    Strong and weak syllables in English can be distinguished on the basis of vowel quality, of stress, or of both factors. Critical for deciding between these factors are syllables containing unstressed unreduced vowels, such as the first syllable of automata. In this study 12 speakers produced sentences containing matched sets of words with initial vowels ranging from stressed to reduced, at normal and at fast speech rates. Measurements of the duration, intensity, F0, and spectral characteristics of the word-initial vowels showed that unstressed unreduced vowels differed significantly from both stressed and reduced vowels. This result held true across speaker sex and dialect. The vowels produced by one speaker were then cross-spliced across the words within each set, and the resulting words' acceptability was rated by listeners. In general, cross-spliced words were only rated significantly less acceptable than unspliced words when reduced vowels interchanged with any other vowel. Correlations between rated acceptability and acoustic characteristics of the cross-spliced words demonstrated that listeners were attending to duration, intensity, and spectral characteristics. Together these results suggest that unstressed unreduced vowels in English pattern differently from both stressed and reduced vowels, so that no acoustic support for a binary categorical distinction exists; nevertheless, listeners make such a distinction, grouping unstressed unreduced vowels by preference with stressed vowels
  • McQueen, J. M., Cutler, A., Briscoe, T., & Norris, D. (1995). Models of continuous speech recognition and the contents of the vocabulary. Language and Cognitive Processes, 10, 309-331. doi:10.1080/01690969508407098.

    Abstract

    Several models of spoken word recognition postulate that recognition is achieved via a process of competition between lexical hypotheses. Competition not only provides a mechanism for isolated word recognition, it also assists in continuous speech recognition, since it offers a means of segmenting continuous input into individual words. We present statistics on the pattern of occurrence of words embedded in the polysyllabic words of the English vocabulary, showing that an overwhelming majority (84%) of polysyllables have shorter words embedded within them. Positional analyses show that these embeddings are most common at the onsets of the longer word. Although both phonological and syntactic constraints could rule out some embedded words, they do not remove the problem. Lexical competition provides a means of dealing with lexical embedding. It is also supported by a growing body of experimental evidence. We present results which indicate that competition operates both between word candidates that begin at the same point in the input and candidates that begin at different points (McQueen, Norris, & Cutler, 1994, Noms, McQueen, & Cutler, in press). We conclude that lexical competition is an essential component in models of continuous speech recognition.
  • Norris, D., McQueen, J. M., & Cutler, A. (1995). Competition and segmentation in spoken word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1209-1228.

    Abstract

    Spoken utterances contain few reliable cues to word boundaries, but listeners nonetheless experience little difficulty identifying words in continuous speech. The authors present data and simulations that suggest that this ability is best accounted for by a model of spoken-word recognition combining competition between alternative lexical candidates and sensitivity to prosodic structure. In a word-spotting experiment, stress pattern effects emerged most clearly when there were many competing lexical candidates for part of the input. Thus, competition between simultaneously active word candidates can modulate the size of prosodic effects, which suggests that spoken-word recognition must be sensitive both to prosodic structure and to the effects of competition. A version of the Shortlist model ( D. G. Norris, 1994b) incorporating the Metrical Segmentation Strategy ( A. Cutler & D. Norris, 1988) accurately simulates the results using a lexicon of more than 25,000 words.
  • Cutler, A., Mehler, J., Norris, D., & Segui, J. (1983). A language-specific comprehension strategy [Letters to Nature]. Nature, 304, 159-160. doi:10.1038/304159a0.

    Abstract

    Infants acquire whatever language is spoken in the environment into which they are born. The mental capability of the newborn child is not biased in any way towards the acquisition of one human language rather than another. Because psychologists who attempt to model the process of language comprehension are interested in the structure of the human mind, rather than in the properties of individual languages, strategies which they incorporate in their models are presumed to be universal, not language-specific. In other words, strategies of comprehension are presumed to be characteristic of the human language processing system, rather than, say, the French, English, or Igbo language processing systems. We report here, however, on a comprehension strategy which appears to be used by native speakers of French but not by native speakers of English.
  • Levelt, W. J. M., & Cutler, A. (1983). Prosodic marking in speech repair. Journal of semantics, 2, 205-217. doi:10.1093/semant/2.2.205.

    Abstract

    Spontaneous self-corrections in speech pose a communication problem; the speaker must make clear to the listener not only that the original Utterance was faulty, but where it was faulty and how the fault is to be corrected. Prosodic marking of corrections - making the prosody of the repair noticeably different from that of the original utterance - offers a resource which the speaker can exploit to provide the listener with such information. A corpus of more than 400 spontaneous speech repairs was analysed, and the prosodic characteristics compared with the syntactic and semantic characteristics of each repair. Prosodic marking showed no relationship at all with the syntactic characteristics of repairs. Instead, marking was associated with certain semantic factors: repairs were marked when the original utterance had been actually erroneous, rather than simply less appropriate than the repair; and repairs tended to be marked more often when the set of items encompassing the error and the repair was small rather than when it was large. These findings lend further weight to the characterization of accent as essentially semantic in function.

Share this page