Anne Cutler

Publications

Displaying 1 - 16 of 16
  • Cutler, A., & Norris, D. (2016). Bottoms up! How top-down pitfalls ensnare speech perception researchers too. Commentary on C. Firestone & B. Scholl: Cognition does not affect perception: Evaluating the evidence for 'top-down' effects. Behavioral and Brain Sciences, e236. doi:10.1017/S0140525X15002745.

    Abstract

    Not only can the pitfalls that Firestone & Scholl (F&S) identify be generalised across multiple studies within the field of visual perception, but also they have general application outside the field wherever perceptual and cognitive processing are compared. We call attention to the widespread susceptibility of research on the perception of speech to versions of the same pitfalls.
  • Norris, D., McQueen, J. M., & Cutler, A. (2016). Prediction, Bayesian inference and feedback in speech recognition. Language, Cognition and Neuroscience, 31(1), 4-18. doi:10.1080/23273798.2015.1081703.

    Abstract

    Speech perception involves prediction, but how is that prediction implemented? In cognitive models prediction has often been taken to imply that there is feedback of activation from lexical to pre-lexical processes as implemented in interactive-activation models (IAMs). We show that simple activation feedback does not actually improve speech recognition. However, other forms of feedback can be beneficial. In particular, feedback can enable the listener to adapt to changing input, and can potentially help the listener to recognise unusual input, or recognise speech in the presence of competing sounds. The common feature of these helpful forms of feedback is that they are all ways of optimising the performance of speech recognition using Bayesian inference. That is, listeners make predictions about speech because speech recognition is optimal in the sense captured in Bayesian models.
  • Broersma, M., & Cutler, A. (2008). Phantom word activation in L2. System, 36(1), 22-34. doi:10.1016/j.system.2007.11.003.

    Abstract

    L2 listening can involve the phantom activation of words which are not actually in the input. All spoken-word recognition involves multiple concurrent activation of word candidates, with selection of the correct words achieved by a process of competition between them. L2 listening involves more such activation than L1 listening, and we report two studies illustrating this. First, in a lexical decision study, L2 listeners accepted (but L1 listeners did not accept) spoken non-words such as groof or flide as real English words. Second, a priming study demonstrated that the same spoken non-words made recognition of the real words groove, flight easier for L2 (but not L1) listeners, suggesting that, for the L2 listeners only, these real words had been activated by the spoken non-word input. We propose that further understanding of the activation and competition process in L2 lexical processing could lead to new understanding of L2 listening difficulty.
  • Cutler, A., Garcia Lecumberri, M. L., & Cooke, M. (2008). Consonant identification in noise by native and non-native listeners: Effects of local context. Journal of the Acoustical Society of America, 124(2), 1264-1268. doi:10.1121/1.2946707.

    Abstract

    Speech recognition in noise is harder in second (L2) than first languages (L1). This could be because noise disrupts speech processing more in L2 than L1, or because L1 listeners recover better though disruption is equivalent. Two similar prior studies produced discrepant results: Equivalent noise effects for L1 and L2 (Dutch) listeners, versus larger effects for L2 (Spanish) than L1. To explain this, the latter experiment was presented to listeners from the former population. Larger noise effects on consonant identification emerged for L2 (Dutch) than L1 listeners, suggesting that task factors rather than L2 population differences underlie the results discrepancy.
  • Cutler, A. (2008). The abstract representations in speech processing. Quarterly Journal of Experimental Psychology, 61(11), 1601-1619. doi:10.1080/13803390802218542.

    Abstract

    Speech processing by human listeners derives meaning from acoustic input via intermediate steps involving abstract representations of what has been heard. Recent results from several lines of research are here brought together to shed light on the nature and role of these representations. In spoken-word recognition, representations of phonological form and of conceptual content are dissociable. This follows from the independence of patterns of priming for a word's form and its meaning. The nature of the phonological-form representations is determined not only by acoustic-phonetic input but also by other sources of information, including metalinguistic knowledge. This follows from evidence that listeners can store two forms as different without showing any evidence of being able to detect the difference in question when they listen to speech. The lexical representations are in turn separate from prelexical representations, which are also abstract in nature. This follows from evidence that perceptual learning about speaker-specific phoneme realization, induced on the basis of a few words, generalizes across the whole lexicon to inform the recognition of all words containing the same phoneme. The efficiency of human speech processing has its basis in the rapid execution of operations over abstract representations.
  • Goudbeek, M., Cutler, A., & Smits, R. (2008). Supervised and unsupervised learning of multidimensionally varying nonnative speech categories. Speech Communication, 50(2), 109-125. doi:10.1016/j.specom.2007.07.003.

    Abstract

    The acquisition of novel phonetic categories is hypothesized to be affected by the distributional properties of the input, the relation of the new categories to the native phonology, and the availability of supervision (feedback). These factors were examined in four experiments in which listeners were presented with novel categories based on vowels of Dutch. Distribution was varied such that the categorization depended on the single dimension duration, the single dimension frequency, or both dimensions at once. Listeners were clearly sensitive to the distributional information, but unidimensional contrasts proved easier to learn than multidimensional. The native phonology was varied by comparing Spanish versus American English listeners. Spanish listeners found categorization by frequency easier than categorization by duration, but this was not true of American listeners, whose native vowel system makes more use of duration-based distinctions. Finally, feedback was either available or not; this comparison showed supervised learning to be significantly superior to unsupervised learning.
  • Kim, J., Davis, C., & Cutler, A. (2008). Perceptual tests of rhythmic similarity: II. Syllable rhythm. Language and Speech, 51(4), 343-359. doi:10.1177/0023830908099069.

    Abstract

    To segment continuous speech into its component words, listeners make use of language rhythm; because rhythm differs across languages, so do the segmentation procedures which listeners use. For each of stress-, syllable-and mora-based rhythmic structure, perceptual experiments have led to the discovery of corresponding segmentation procedures. In the case of mora-based rhythm, similar segmentation has been demonstrated in the otherwise unrelated languages Japanese and Telugu; segmentation based on syllable rhythm, however, has been previously demonstrated only for European languages from the Romance family. We here report two target detection experiments in which Korean listeners, presented with speech in Korean and in French, displayed patterns of segmentation like those previously observed in analogous experiments with French listeners. The Korean listeners' accuracy in detecting word-initial target fragments in either language was significantly higher when the fragments corresponded exactly to a syllable in the input than when the fragments were smaller or larger than a syllable. We conclude that Korean and French listeners can call on similar procedures for segmenting speech, and we further propose that perceptual tests of speech segmentation provide a valuable accompaniment to acoustic analyses for establishing languages' rhythmic class membership.
  • Cutler, A. (2001). Listening to a second language through the ears of a first. Interpreting, 5, 1-23.
  • Cutler, A., & Van Donselaar, W. (2001). Voornaam is not a homophone: Lexical prosody and lexical access in Dutch. Language and Speech, 44, 171-195. doi:10.1177/00238309010440020301.

    Abstract

    Four experiments examined Dutch listeners’ use of suprasegmental information in spoken-word recognition. Isolated syllables excised from minimal stress pairs such as VOORnaam/voorNAAM could be reliably assigned to their source words. In lexical decision, no priming was observed from one member of minimal stress pairs to the other, suggesting that the pairs’ segmental ambiguity was removed by suprasegmental information.Words embedded in nonsense strings were harder to detect if the nonsense string itself formed the beginning of a competing word, but a suprasegmental mismatch to the competing word significantly reduced this inhibition. The same nonsense strings facilitated recognition of the longer words of which they constituted the beginning, butagain the facilitation was significantly reduced by suprasegmental mismatch. Together these results indicate that Dutch listeners effectively exploit suprasegmental cues in recognizing spoken words. Nonetheless, suprasegmental mismatch appears to be somewhat less effective in constraining activation than segmental mismatch.
  • McQueen, J. M., Otake, T., & Cutler, A. (2001). Rhythmic cues and possible-word constraints in Japanese speech segmentation. Journal of Memory and Language, 45, 103-132. doi:10.1006/jmla.2000.2763.

    Abstract

    In two word-spotting experiments, Japanese listeners detected Japanese words faster in vowel contexts (e.g., agura, to sit cross-legged, in oagura) than in consonant contexts (e.g., tagura). In the same experiments, however, listeners spotted words in vowel contexts (e.g., saru, monkey, in sarua) no faster than in moraic nasal contexts (e.g., saruN). In a third word-spotting experiment, words like uni, sea urchin, followed contexts consisting of a consonant-consonant-vowel mora (e.g., gya) plus either a moraic nasal (gyaNuni), a vowel (gyaouni) or a consonant (gyabuni). Listeners spotted words as easily in the first as in the second context (where in each case the target words were aligned with mora boundaries), but found it almost impossible to spot words in the third (where there was a single consonant, such as the [b] in gyabuni, between the beginning of the word and the nearest preceding mora boundary). Three control experiments confirmed that these effects reflected the relative ease of segmentation of the words from their contexts.We argue that the listeners showed sensitivity to the viability of sound sequences as possible Japanese words in the way that they parsed the speech into words. Since single consonants are not possible Japanese words, the listeners avoided lexical parses including single consonants and thus had difficulty recognizing words in the consonant contexts. Even though moraic nasals are also impossible words, they were not difficult segmentation contexts because, as with the vowel contexts, the mora boundaries between the contexts and the target words signaled likely word boundaries. Moraic rhythm appears to provide Japanese listeners with important segmentation cues.
  • McQueen, J. M., & Cutler, A. (2001). Spoken word access processes: An introduction. Language and Cognitive Processes, 16, 469-490. doi:10.1080/01690960143000209.

    Abstract

    We introduce the papers in this special issue by summarising the current major issues in spoken word recognition. We argue that a full understanding of the process of lexical access during speech comprehension will depend on resolving several key representational issues: what is the form of the representations used for lexical access; how is phonological information coded in the mental lexicon; and how is the morphological and semantic information about each word stored? We then discuss a number of distinct access processes: competition between lexical hypotheses; the computation of goodness-of-fit between the signal and stored lexical knowledge; segmentation of continuous speech; whether the lexicon influences prelexical processing through feedback; and the relationship of form-based processing to the processes responsible for deriving an interpretation of a complete utterance. We conclude that further progress may well be made by swapping ideas among the different sub-domains of the discipline.
  • Norris, D., McQueen, J. M., Cutler, A., Butterfield, S., & Kearns, R. (2001). Language-universal constraints on speech segmentation. Language and Cognitive Processes, 16, 637-660. doi:10.1080/01690960143000119.

    Abstract

    Two word-spotting experiments are reported that examine whether the Possible-Word Constraint (PWC) is a language-specific or language-universal strategy for the segmentation of continuous speech. The PWC disfavours parses which leave an impossible residue between the end of a candidate word and any likely location of a word boundary, as cued in the speech signal. The experiments examined cases where the residue was either a CVC syllable with a schwa, or a CV syllable with a lax vowel. Although neither of these syllable contexts is a possible lexical word in English, word-spotting in both contexts was easier than in a context consisting of a single consonant. Two control lexical-decision experiments showed that the word-spotting results reflected the relative segmentation difficulty of the words in different contexts. The PWC appears to be language-universal rather than language-specific.
  • Soto-Faraco, S., Sebastian-Galles, N., & Cutler, A. (2001). Segmental and suprasegmental mismatch in lexical access. Journal of Memory and Language, 45, 412-432. doi:10.1006/jmla.2000.2783.

    Abstract

    Four cross-modal priming experiments in Spanish addressed the role of suprasegmental and segmental information in the activation of spoken words. Listeners heard neutral sentences ending with word fragments (e.g., princi-) and made lexical decisions on letter strings presented at fragment offset. Responses were compared for fragment primes that fully matched the spoken form of the initial portion of target words, versus primes that mismatched in a single element (stress pattern; one vowel; one consonant), versus control primes. Fully matching primes always facilitated lexical decision responses, in comparison to the control condition, while mismatching primes always produced inhibition. The respective strength of the contribution of stress, vowel, and consonant (one feature mismatch or more) information did not differ statistically. The results support a model of spoken-word recognition involving automatic activation of word forms and competition between activated words, in which the activation process is sensitive to all acoustic information relevant to the language’s phonology.
  • Warner, N., Jongman, A., Cutler, A., & Mücke, D. (2001). The phonological status of Dutch epenthetic schwa. Phonology, 18, 387-420. doi:10.1017/S0952675701004213.

    Abstract

    In this paper, we use articulatory measures to determine whether Dutch schwa epenthesis is an abstract phonological process or a concrete phonetic process depending on articulatory timing. We examine tongue position during /l/ before underlying schwa and epenthetic schwa and in coda position. We find greater tip raising before both types of schwa, indicating light /l/ before schwa and dark /l/ in coda position. We argue that the ability of epenthetic schwa to condition the /l/ alternation shows that Dutch schwa epenthesis is an abstract phonological process involving insertion of some unit, and cannot be accounted for within Articulatory Phonology.
  • Cutler, A., Mehler, J., Norris, D., & Segui, J. (1983). A language-specific comprehension strategy [Letters to Nature]. Nature, 304, 159-160. doi:10.1038/304159a0.

    Abstract

    Infants acquire whatever language is spoken in the environment into which they are born. The mental capability of the newborn child is not biased in any way towards the acquisition of one human language rather than another. Because psychologists who attempt to model the process of language comprehension are interested in the structure of the human mind, rather than in the properties of individual languages, strategies which they incorporate in their models are presumed to be universal, not language-specific. In other words, strategies of comprehension are presumed to be characteristic of the human language processing system, rather than, say, the French, English, or Igbo language processing systems. We report here, however, on a comprehension strategy which appears to be used by native speakers of French but not by native speakers of English.
  • Levelt, W. J. M., & Cutler, A. (1983). Prosodic marking in speech repair. Journal of semantics, 2, 205-217. doi:10.1093/semant/2.2.205.

    Abstract

    Spontaneous self-corrections in speech pose a communication problem; the speaker must make clear to the listener not only that the original Utterance was faulty, but where it was faulty and how the fault is to be corrected. Prosodic marking of corrections - making the prosody of the repair noticeably different from that of the original utterance - offers a resource which the speaker can exploit to provide the listener with such information. A corpus of more than 400 spontaneous speech repairs was analysed, and the prosodic characteristics compared with the syntactic and semantic characteristics of each repair. Prosodic marking showed no relationship at all with the syntactic characteristics of repairs. Instead, marking was associated with certain semantic factors: repairs were marked when the original utterance had been actually erroneous, rather than simply less appropriate than the repair; and repairs tended to be marked more often when the set of items encompassing the error and the repair was small rather than when it was large. These findings lend further weight to the characterization of accent as essentially semantic in function.

Share this page