Anne Cutler

Publications

Displaying 1 - 11 of 11
  • Cutler, A., & Norris, D. (2016). Bottoms up! How top-down pitfalls ensnare speech perception researchers too. Commentary on C. Firestone & B. Scholl: Cognition does not affect perception: Evaluating the evidence for 'top-down' effects. Behavioral and Brain Sciences, e236. doi:10.1017/S0140525X15002745.

    Abstract

    Not only can the pitfalls that Firestone & Scholl (F&S) identify be generalised across multiple studies within the field of visual perception, but also they have general application outside the field wherever perceptual and cognitive processing are compared. We call attention to the widespread susceptibility of research on the perception of speech to versions of the same pitfalls.
  • Norris, D., McQueen, J. M., & Cutler, A. (2016). Prediction, Bayesian inference and feedback in speech recognition. Language, Cognition and Neuroscience, 31(1), 4-18. doi:10.1080/23273798.2015.1081703.

    Abstract

    Speech perception involves prediction, but how is that prediction implemented? In cognitive models prediction has often been taken to imply that there is feedback of activation from lexical to pre-lexical processes as implemented in interactive-activation models (IAMs). We show that simple activation feedback does not actually improve speech recognition. However, other forms of feedback can be beneficial. In particular, feedback can enable the listener to adapt to changing input, and can potentially help the listener to recognise unusual input, or recognise speech in the presence of competing sounds. The common feature of these helpful forms of feedback is that they are all ways of optimising the performance of speech recognition using Bayesian inference. That is, listeners make predictions about speech because speech recognition is optimal in the sense captured in Bayesian models.
  • Cutler, A., Sebastian-Galles, N., Soler-Vilageliu, O., & Van Ooijen, B. (2000). Constraints of vowels and consonants on lexical selection: Cross-linguistic comparisons. Memory & Cognition, 28, 746-755.

    Abstract

    Languages differ in the constitution of their phonemic repertoire and in the relative distinctiveness of phonemes within the repertoire. In the present study, we asked whether such differences constrain spoken-word recognition, via two word reconstruction experiments, in which listeners turned non-words into real words by changing single sounds. The experiments were carried out in Dutch (which has a relatively balanced vowel-consonant ratio and many similar vowels) and in Spanish (which has many more consonants than vowels and high distinctiveness among the vowels). Both Dutch and Spanish listeners responded significantly faster and more accurately when required to change vowels as opposed to consonants; when allowed to change any phoneme, they more often altered vowels than consonants. Vowel information thus appears to constrain lexical selection less tightly (allow more potential candidates) than does consonant information, independent of language-specific phoneme repertoire and of relative distinctiveness of vowels.
  • Cutler, A., & Van de Weijer, J. (2000). De ontdekking van de eerste woorden. Stem-, Spraak- en Taalpathologie, 9, 245-259.

    Abstract

    Spraak is continu, er zijn geen betrouwbare signalen waardoor de luisteraar weet waar het ene woord eindigt en het volgende begint. Voor volwassen luisteraars is het segmenteren van gesproken taal in afzonderlijke woorden dus niet onproblematisch, maar voor een kind dat nog geen woordenschat bezit, vormt de continuïteit van spraak een nog grotere uitdaging. Desalniettemin produceren de meeste kinderen hun eerste herkenbare woorden rond het begin van het tweede levensjaar. Aan deze vroege spraakproducties gaat een formidabele perceptuele prestatie vooraf. Tijdens het eerste levensjaar - met name gedurende de tweede helft - ontwikkelt de spraakperceptie zich van een algemeen fonetisch discriminatievermogen tot een selectieve gevoeligheid voor de fonologische contrasten die in de moedertaal voorkomen. Recent onderzoek heeft verder aangetoond dat kinderen, lang voordat ze ook maar een enkel woord kunnen zeggen, in staat zijn woorden die kenmerkend zijn voor hun moedertaal te onderscheiden van woorden die dat niet zijn. Bovendien kunnen ze woorden die eerst in isolatie werden aangeboden herkennen in een continue spraakcontext. Het dagelijkse taalaanbod aan een kind van deze leeftijd maakt het in zekere zin niet gemakkelijk, bijvoorbeeld doordat de meeste woorden niet in isolatie voorkomen. Toch wordt het kind ook wel houvast geboden, onder andere doordat het woordgebruik beperkt is.
  • Houston, D. M., Jusczyk, P. W., Kuijpers, C., Coolen, R., & Cutler, A. (2000). Cross-language word segmentation by 9-month-olds. Psychonomic Bulletin & Review, 7, 504-509.

    Abstract

    Dutch-learning and English-learning 9-month-olds were tested, using the Headturn Preference Procedure, for their ability to segment Dutch words with strong/weak stress patterns from fluent Dutch speech. This prosodic pattern is highly typical for words of both languages. The infants were familiarized with pairs of words and then tested on four passages, two that included the familiarized words and two that did not. Both the Dutch- and the English-learning infants gave evidence of segmenting the targets from the passages, to an equivalent degree. Thus, English-learning infants are able to extract words from fluent speech in a language that is phonetically different from English. We discuss the possibility that this cross-language segmentation ability is aided by the similarity of the typical rhythmic structure of Dutch and English words.
  • Norris, D., McQueen, J. M., & Cutler, A. (2000). Feedback on feedback on feedback: It’s feedforward. (Response to commentators). Behavioral and Brain Sciences, 23, 352-370.

    Abstract

    The central thesis of the target article was that feedback is never necessary in spoken word recognition. The commentaries present no new data and no new theoretical arguments which lead us to revise this position. In this response we begin by clarifying some terminological issues which have lead to a number of significant misunderstandings. We provide some new arguments to support our case that the feedforward model Merge is indeed more parsimonious than the interactive alternatives, and that it provides a more convincing account of the data than alternative models. Finally, we extend the arguments to deal with new issues raised by the commentators such as infant speech perception and neural architecture.
  • Norris, D., McQueen, J. M., & Cutler, A. (2000). Merging information in speech recognition: Feedback is never necessary. Behavioral and Brain Sciences, 23, 299-325.

    Abstract

    Top-down feedback does not benefit speech recognition; on the contrary, it can hinder it. No experimental data imply that feedback loops are required for speech recognition. Feedback is accordingly unnecessary and spoken word recognition is modular. To defend this thesis, we analyse lexical involvement in phonemic decision making. TRACE (McClelland & Elman 1986), a model with feedback from the lexicon to prelexical processes, is unable to account for all the available data on phonemic decision making. The modular Race model (Cutler & Norris 1979) is likewise challenged by some recent results, however. We therefore present a new modular model of phonemic decision making, the Merge model. In Merge, information flows from prelexical processes to the lexicon without feedback. Because phonemic decisions are based on the merging of prelexical and lexical information, Merge correctly predicts lexical involvement in phonemic decisions in both words and nonwords. Computer simulations show how Merge is able to account for the data through a process of competition between lexical hypotheses. We discuss the issue of feedback in other areas of language processing and conclude that modular models are particularly well suited to the problems and constraints of speech recognition.
  • Cutler, A., Van Ooijen, B., Norris, D., & Sanchez-Casas, R. (1996). Speeded detection of vowels: A cross-linguistic study. Perception and Psychophysics, 58, 807-822. Retrieved from http://www.psychonomic.org/search/view.cgi?id=430.

    Abstract

    In four experiments, listeners’ response times to detect vowel targets in spoken input were measured. The first three experiments were conducted in English. In two, one using real words and the other, nonwords, detection accuracy was low, targets in initial syllables were detected more slowly than targets in final syllables, and both response time and missed-response rate were inversely correlated with vowel duration. In a third experiment, the speech context for some subjects included all English vowels, while for others, only five relatively distinct vowels occurred. This manipulation had essentially no effect, and the same response pattern was again observed. A fourth experiment, conducted in Spanish, replicated the results in the first three experiments, except that miss rate was here unrelated to vowel duration. We propose that listeners’ responses to vowel targets in naturally spoken input are effectively cautious, reflecting realistic appreciation of vowel variability in natural context.
  • Otake, T., Yoneyama, K., Cutler, A., & van der Lugt, A. (1996). The representation of Japanese moraic nasals. Journal of the Acoustical Society of America, 100, 3831-3842. doi:10.1121/1.417239.

    Abstract

    Nasal consonants in syllabic coda position in Japanese assimilate to the place of articulation of a following consonant. The resulting forms may be perceived as different realizations of a single underlying unit, and indeed the kana orthographies represent them with a single character. In the present study, Japanese listeners' response time to detect nasal consonants was measured. Nasals in coda position, i.e., moraic nasals, were detected faster and more accurately than nonmoraic nasals, as reported in previous studies. The place of articulation with which moraic nasals were realized affected neither response time nor accuracy. Non-native subjects who knew no Japanese, given the same materials with the same instructions, simply failed to respond to moraic nasals which were realized bilabially. When the nasals were cross-spliced across place of articulation contexts the Japanese listeners still showed no significant place of articulation effects, although responses were faster and more accurate to unspliced than to cross-spliced nasals. When asked to detect the phoneme following the (cross-spliced) moraic nasal, Japanese listeners showed effects of mismatch between nasal and context, but non-native listeners did not. Together, these results suggest that Japanese listeners are capable of very rapid abstraction from phonetic realization to a unitary representation of moraic nasals; but they can also use the phonetic realization of a moraic nasal effectively to obtain anticipatory information about following phonemes.
  • Cutler, A., Mehler, J., Norris, D., & Segui, J. (1983). A language-specific comprehension strategy [Letters to Nature]. Nature, 304, 159-160. doi:10.1038/304159a0.

    Abstract

    Infants acquire whatever language is spoken in the environment into which they are born. The mental capability of the newborn child is not biased in any way towards the acquisition of one human language rather than another. Because psychologists who attempt to model the process of language comprehension are interested in the structure of the human mind, rather than in the properties of individual languages, strategies which they incorporate in their models are presumed to be universal, not language-specific. In other words, strategies of comprehension are presumed to be characteristic of the human language processing system, rather than, say, the French, English, or Igbo language processing systems. We report here, however, on a comprehension strategy which appears to be used by native speakers of French but not by native speakers of English.
  • Levelt, W. J. M., & Cutler, A. (1983). Prosodic marking in speech repair. Journal of semantics, 2, 205-217. doi:10.1093/semant/2.2.205.

    Abstract

    Spontaneous self-corrections in speech pose a communication problem; the speaker must make clear to the listener not only that the original Utterance was faulty, but where it was faulty and how the fault is to be corrected. Prosodic marking of corrections - making the prosody of the repair noticeably different from that of the original utterance - offers a resource which the speaker can exploit to provide the listener with such information. A corpus of more than 400 spontaneous speech repairs was analysed, and the prosodic characteristics compared with the syntactic and semantic characteristics of each repair. Prosodic marking showed no relationship at all with the syntactic characteristics of repairs. Instead, marking was associated with certain semantic factors: repairs were marked when the original utterance had been actually erroneous, rather than simply less appropriate than the repair; and repairs tended to be marked more often when the set of items encompassing the error and the repair was small rather than when it was large. These findings lend further weight to the characterization of accent as essentially semantic in function.

Share this page