Anne Cutler

Publications

  • Bruggeman, L., & Cutler, A. (2016). Lexical manipulation as a discovery tool for psycholinguistic research. In C. Carignan, & M. D. Tyler (Eds.), Proceedings of the 16th Australasian International Conference on Speech Science and Technology (SST2016) (pp. 313-316).
  • Cutler, A., & Norris, D. (2016). Bottoms up! How top-down pitfalls ensnare speech perception researchers too. Commentary on C. Firestone & B. Scholl: Cognition does not affect perception: Evaluating the evidence for 'top-down' effects. Behavioral and Brain Sciences, e236. doi:10.1017/S0140525X15002745.

    Abstract

    Not only can the pitfalls that Firestone & Scholl (F&S) identify be generalised across multiple studies within the field of visual perception, but also they have general application outside the field wherever perceptual and cognitive processing are compared. We call attention to the widespread susceptibility of research on the perception of speech to versions of the same pitfalls.
  • Ip, M., & Cutler, A. (2016). Cross-language data on five types of prosodic focus. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Proceedings of Speech Prosody 2016 (pp. 330-334).

    Abstract

    To examine the relative roles of language-specific and language-universal mechanisms in the production of prosodic focus, we compared production of five different types of focus by native speakers of English and Mandarin. Two comparable dialogues were constructed for each language, with the same words appearing in focused and unfocused position; 24 speakers recorded each dialogue in each language. Duration, F0 (mean, maximum, range), and rms-intensity (mean, maximum) of all critical word tokens were measured. Across the different types of focus, cross-language differences were observed in the degree to which English versus Mandarin speakers use the different prosodic parameters to mark focus, suggesting that while prosody may be universally available for expressing focus, the means of its employment may be considerably language-specific.
  • Jeske, J., Kember, H., & Cutler, A. (2016). Native and non-native English speakers' use of prosody to predict sentence endings. In Proceedings of the 16th Australasian International Conference on Speech Science and Technology (SST2016).
  • Kember, H., Choi, J., & Cutler, A. (2016). Processing advantages for focused words in Korean. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Proceedings of Speech Prosody 2016 (pp. 702-705).

    Abstract

    In Korean, focus is expressed in accentual phrasing. To ascertain whether words focused in this manner enjoy a processing advantage analogous to that conferred by focus as expressed in, e.g., English and Dutch, we devised sentences with target words in one of four conditions: prosodic focus, syntactic focus, prosodic + syntactic focus, and no focus as a control. 32 native speakers of Korean listened to blocks of 10 sentences, then were presented visually with words and asked whether or not they had heard them. Overall, words with focus were recognised significantly faster and more accurately than unfocused words. In addition, words with syntactic focus or syntactic + prosodic focus were recognised faster than words with prosodic focus alone. As for other languages, Korean focus confers processing advantage on the words carrying it. While prosodic focus does provide an advantage, however, syntactic focus appears to provide the greater beneficial effect for recognition memory.
  • Norris, D., McQueen, J. M., & Cutler, A. (2016). Prediction, Bayesian inference and feedback in speech recognition. Language, Cognition and Neuroscience, 31(1), 4-18. doi:10.1080/23273798.2015.1081703.

    Abstract

    Speech perception involves prediction, but how is that prediction implemented? In cognitive models prediction has often been taken to imply that there is feedback of activation from lexical to pre-lexical processes as implemented in interactive-activation models (IAMs). We show that simple activation feedback does not actually improve speech recognition. However, other forms of feedback can be beneficial. In particular, feedback can enable the listener to adapt to changing input, and can potentially help the listener to recognise unusual input, or recognise speech in the presence of competing sounds. The common feature of these helpful forms of feedback is that they are all ways of optimising the performance of speech recognition using Bayesian inference. That is, listeners make predictions about speech because speech recognition is optimal in the sense captured in Bayesian models.
  • Broersma, M., & Cutler, A. (2011). Competition dynamics of second-language listening. Quarterly Journal of Experimental Psychology, 64, 74-95. doi:10.1080/17470218.2010.499174.

    Abstract

    Spoken-word recognition in a nonnative language is particularly difficult where it depends on discrimination between confusable phonemes. Four experiments here examine whether this difficulty is in part due to phantom competition from “near-words” in speech. Dutch listeners confuse English /æ/ and /ε/, which could lead to the sequence daf being interpreted as deaf, or lemp being interpreted as lamp. In auditory lexical decision, Dutch listeners indeed accepted such near-words as real English words more often than English listeners did. In cross-modal priming, near-words extracted from word or phrase contexts (daf from DAFfodil, lemp from eviL EMPire) induced activation of corresponding real words (deaf; lamp) for Dutch, but again not for English, listeners. Finally, by the end of untruncated carrier words containing embedded words or near-words (definite; daffodil) no activation of the real embedded forms (deaf in definite) remained for English or Dutch listeners, but activation of embedded near-words (deaf in daffodil) did still remain, for Dutch listeners only. Misinterpretation of the initial vowel here favoured the phantom competitor and disfavoured the carrier (lexically represented as containing a different vowel). Thus, near-words compete for recognition and continue competing for longer than actually embedded words; nonnative listening indeed involves phantom competition.
  • Cutler, A., Andics, A., & Fang, Z. (2011). Inter-dependent categorization of voices and segments. In W.-S. Lee, & E. Zee (Eds.), Proceedings of the 17th International Congress of Phonetic Sciences [ICPhS 2011] (pp. 552-555). Hong Kong: Department of Chinese, Translation and Linguistics, City University of Hong Kong.

    Abstract

    Listeners performed speeded two-alternative choice between two unfamiliar and relatively similar voices or between two phonetically close segments, in VC syllables. For each decision type (segment, voice), the non-target dimension (voice, segment) either was constant, or varied across four alternatives. Responses were always slower when a non-target dimension varied than when it did not, but the effect of phonetic variation on voice identity decision was stronger than that of voice variation on phonetic identity decision. Cues to voice and segment identity in speech are processed inter-dependently, but hard categorization decisions about voices draw on, and are hence sensitive to, segmental information.
  • Cutler, A. (2011). Listening to REAL second language. AATSEEL Newsletter, 54(3), 14.
  • Johnson, E. K., Westrek, E., Nazzi, T., & Cutler, A. (2011). Infant ability to tell voices apart rests on language experience. Developmental Science, 14(5), 1002-1011. doi:10.1111/j.1467-7687.2011.01052.x.

    Abstract

    A visual fixation study tested whether seven-month-olds can discriminate between different talkers. The infants were first habituated to talkers producing sentences in either a familiar or unfamiliar language, then heard test sentences from previously unheard speakers, either in the language used for habituation, or in another language. When the language at test mismatched that in habituation, infants always noticed the change. When language remained constant and only talker altered, however, infants detected the change only if the language was the native tongue. Adult listeners with a different native tongue than the infants did not reproduce the discriminability patterns shown by the infants, and infants detected neither voice nor language changes in reversed speech; both these results argue against explanation of the native-language voice discrimination in terms of acoustic properties of the stimuli. The ability to identify talkers is, like many other perceptual abilities, strongly influenced by early life experience.
  • Tuinman, A., & Cutler, A. (2011). L1 knowledge and the perception of casual speech processes in L2. In M. Wrembel, M. Kul, & K. Dziubalska-Kolaczyk (Eds.), Achievements and perspectives in SLA of speech: New Sounds 2010. Volume I (pp. 289-301). Frankfurt am Main: Peter Lang.

    Abstract

    Every language manifests casual speech processes, and hence every second language too. This study examined how listeners deal with second-language casual speech processes, as a function of the processes in their native language. We compared a match case, where a second-language process (/t/-reduction) is also operative in native speech, with a mismatch case, where a second-language process (/r/-insertion) is absent from native speech. In each case native and non-native listeners judged stimuli in which a given phoneme (in sentence context) varied along a continuum from absent to present. Second-language listeners in general mimicked native performance in the match case, but deviated significantly from native performance in the mismatch case. Together these results make it clear that the mapping from first to second language is as important in the interpretation of casual speech processes as in other dimensions of speech perception. Unfamiliar casual speech processes are difficult to adapt to in a second language. Casual speech processes that are already familiar from native speech, however, are easy to adapt to; indeed, our results even suggest that it is possible for subtle differences in their occurrence patterns across the two languages to be detected, and to be accommodated to in second-language listening.
  • Tuinman, A., Mitterer, H., & Cutler, A. (2011). Perception of intrusive /r/ in English by native, cross-language and cross-dialect listeners. Journal of the Acoustical Society of America, 130, 1643-1652. doi:10.1121/1.3619793.

    Abstract

    In sequences such as law and order, speakers of British English often insert /r/ between law and and. Acoustic analyses revealed such “intrusive” /r/ to be significantly shorter than canonical /r/. In a 2AFC experiment, native listeners heard British English sentences in which /r/ duration was manipulated across a word boundary [e.g., saw (r)ice], and orthographic and semantic factors were varied. These listeners responded categorically on the basis of acoustic evidence for /r/ alone, reporting ice after short /r/s, rice after long /r/s; orthographic and semantic factors had no effect. Dutch listeners proficient in English who heard the same materials relied less on durational cues than the native listeners, and were affected by both orthography and semantic bias. American English listeners produced intermediate responses to the same materials, being sensitive to duration (less so than native, more so than Dutch listeners), and to orthography (less so than the Dutch), but insensitive to the semantic manipulation. Listeners from language communities without common use of intrusive /r/ may thus interpret intrusive /r/ as canonical /r/, with a language difference increasing this propensity more than a dialect difference. Native listeners, however, efficiently distinguish intrusive from canonical /r/ by exploiting the relevant acoustic variation.
  • Tuinman, A., Mitterer, H., & Cutler, A. (2011). The efficiency of cross-dialectal word recognition. In Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy (pp. 153-156).

    Abstract

    Dialects of the same language can differ in the casual speech processes they allow; e.g., British English allows the insertion of [r] at word boundaries in sequences such as saw ice, while American English does not. In two speeded word recognition experiments, American listeners heard such British English sequences; in contrast to non-native listeners, they accurately perceived intended vowel-initial words even with intrusive [r]. Thus despite input mismatches, cross-dialectal word recognition benefits from the full power of native-language processing.
  • Wagner, M., Tran, D., Togneri, R., Rose, P., Powers, D., Onslow, M., Loakes, D., Lewis, T., Kuratate, T., Kinoshita, Y., Kemp, N., Ishihara, S., Ingram, J., Hajek, J., Grayden, D., Göcke, R., Fletcher, J., Estival, D., Epps, J., Dale, R., Cutler, A., Cox, F., Chetty, G., Cassidy, S., Butcher, A., Burnham, D., Bird, S., Best, C., Bennamoun, M., Arciuli, J., & Ambikairajah, E. (2011). The Big Australian Speech Corpus (The Big ASC). In M. Tabain, J. Fletcher, D. Grayden, J. Hajek, & A. Butcher (Eds.), Proceedings of the Thirteenth Australasian International Conference on Speech Science and Technology (pp. 166-170). Melbourne: ASSTA.
  • Boland, J. E., & Cutler, A. (1995). Interaction with autonomy: Defining multiple output models in psycholinguistic theory. Working Papers in Linguistics, 45, 1-10. Retrieved from http://hdl.handle.net/2066/15768.

    Abstract

    There are currently a number of psycholinguistic models in which processing at a particular level of representation is characterized by the generation of multiple outputs, with resolution involving the use of information from higher levels of processing. Surprisingly, models with this architecture have been characterized as autonomous within the domain of word recognition and as interactive within the domain of sentence processing. We suggest that the apparent internal confusion is not, as might be assumed, due to fundamental differences between lexical and syntactic processing. Rather, we believe that the labels in each domain were chosen in order to obtain maximal contrast between a new model and the model or models that were currently dominating the field.
  • Boland, J. E., & Cutler, A. (1995). Interaction with autonomy: Multiple Output models and the inadequacy of the Great Divide. Cognition, 58, 309-320. doi:10.1016/0010-0277(95)00684-2.

    Abstract

    There are currently a number of psycholinguistic models in which processing at a particular level of representation is characterized by the generation of multiple outputs, with resolution - but not generation - involving the use of information from higher levels of processing. Surprisingly, models with this architecture have been characterized as autonomous within the domain of word recognition but as interactive within the domain of sentence processing. We suggest that the apparent confusion is not, as might be assumed, due to fundamental differences between lexical and syntactic processing. Rather, we believe that the labels in each domain were chosen in order to obtain maximal contrast between a new model and the model or models that were currently dominating the field. The contradiction serves to highlight the inadequacy of a simple autonomy/interaction dichotomy for characterizing the architectures of current processing models.
  • Cutler, A., & Chen, H.-C. (1995). Phonological similarity effects in Cantonese word recognition. In K. Elenius, & P. Branderud (Eds.), Proceedings of the Thirteenth International Congress of Phonetic Sciences: Vol. 1 (pp. 106-109). Stockholm: Stockholm University.

    Abstract

    Two lexical decision experiments in Cantonese are described in which the recognition of spoken target words as a function of phonological similarity to a preceding prime is investigated. Phonological similarity in first syllables produced inhibition, while similarity in second syllables led to facilitation. Differences between syllables in tonal and segmental structure had generally similar effects.
  • Cutler, A. (1995). Spoken word recognition and production. In J. L. Miller, & P. D. Eimas (Eds.), Speech, language and communication (pp. 97-136). New York: Academic Press.

    Abstract

    This chapter highlights that most language behavior consists of speaking and listening. The chapter also reveals differences and similarities between speaking and listening. The laboratory study of word production raises formidable problems; ensuring that a particular word is produced may subvert the spontaneous production process. Word production is investigated via slips and tip-of-the-tongue (TOT), primarily via instances of processing failure and via the technique of the picture-naming task. The methodology of word production is explained in the chapter. The chapter also explains the phenomenon of interaction between various stages of word production and the process of speech recognition. In this context, it explores the difference between sound and meaning and examines whether or not the comparisons are appropriate between the processes of recognition and production of spoken words. It also describes the similarities and differences in the structure of the recognition and production systems. Finally, the chapter highlights the common issues in recognition and production research, which include the nuances of frequency of occurrence, morphological structure, and phonological structure.
  • Cutler, A. (1995). Spoken-word recognition. In G. Bloothooft, V. Hazan, D. Hubert, & J. Llisterri (Eds.), European studies in phonetics and speech communication (pp. 66-71). Utrecht: OTS.
  • Cutler, A., & McQueen, J. M. (1995). The recognition of lexical units in speech. In B. De Gelder, & J. Morais (Eds.), Speech and reading: A comparative approach (pp. 33-47). Hove, UK: Erlbaum.
  • Cutler, A. (1995). The perception of rhythm in spoken and written language. In J. Mehler, & S. Franck (Eds.), Cognition on cognition (pp. 283-288). Cambridge, MA: MIT Press.
  • Cutler, A. (1995). Universal and Language-Specific in the Development of Speech. Biology International, (Special Issue 33).

    Additional information

    http://www.iubs.org/?id=34
  • Fear, B. D., Cutler, A., & Butterfield, S. (1995). The strong/weak syllable distinction in English. Journal of the Acoustical Society of America, 97, 1893-1904. doi:10.1121/1.412063.

    Abstract

    Strong and weak syllables in English can be distinguished on the basis of vowel quality, of stress, or of both factors. Critical for deciding between these factors are syllables containing unstressed unreduced vowels, such as the first syllable of automata. In this study 12 speakers produced sentences containing matched sets of words with initial vowels ranging from stressed to reduced, at normal and at fast speech rates. Measurements of the duration, intensity, F0, and spectral characteristics of the word-initial vowels showed that unstressed unreduced vowels differed significantly from both stressed and reduced vowels. This result held true across speaker sex and dialect. The vowels produced by one speaker were then cross-spliced across the words within each set, and the resulting words' acceptability was rated by listeners. In general, cross-spliced words were only rated significantly less acceptable than unspliced words when reduced vowels interchanged with any other vowel. Correlations between rated acceptability and acoustic characteristics of the cross-spliced words demonstrated that listeners were attending to duration, intensity, and spectral characteristics. Together these results suggest that unstressed unreduced vowels in English pattern differently from both stressed and reduced vowels, so that no acoustic support for a binary categorical distinction exists; nevertheless, listeners make such a distinction, grouping unstressed unreduced vowels by preference with stressed vowels.
  • McQueen, J. M., Cutler, A., Briscoe, T., & Norris, D. (1995). Models of continuous speech recognition and the contents of the vocabulary. Language and Cognitive Processes, 10, 309-331. doi:10.1080/01690969508407098.

    Abstract

    Several models of spoken word recognition postulate that recognition is achieved via a process of competition between lexical hypotheses. Competition not only provides a mechanism for isolated word recognition, it also assists in continuous speech recognition, since it offers a means of segmenting continuous input into individual words. We present statistics on the pattern of occurrence of words embedded in the polysyllabic words of the English vocabulary, showing that an overwhelming majority (84%) of polysyllables have shorter words embedded within them. Positional analyses show that these embeddings are most common at the onsets of the longer word. Although both phonological and syntactic constraints could rule out some embedded words, they do not remove the problem. Lexical competition provides a means of dealing with lexical embedding. It is also supported by a growing body of experimental evidence. We present results which indicate that competition operates both between word candidates that begin at the same point in the input and candidates that begin at different points (McQueen, Norris, & Cutler, 1994; Norris, McQueen, & Cutler, in press). We conclude that lexical competition is an essential component in models of continuous speech recognition.
  • Norris, D., McQueen, J. M., & Cutler, A. (1995). Competition and segmentation in spoken word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1209-1228.

    Abstract

    Spoken utterances contain few reliable cues to word boundaries, but listeners nonetheless experience little difficulty identifying words in continuous speech. The authors present data and simulations that suggest that this ability is best accounted for by a model of spoken-word recognition combining competition between alternative lexical candidates and sensitivity to prosodic structure. In a word-spotting experiment, stress pattern effects emerged most clearly when there were many competing lexical candidates for part of the input. Thus, competition between simultaneously active word candidates can modulate the size of prosodic effects, which suggests that spoken-word recognition must be sensitive both to prosodic structure and to the effects of competition. A version of the Shortlist model (D. G. Norris, 1994b) incorporating the Metrical Segmentation Strategy (A. Cutler & D. Norris, 1988) accurately simulates the results using a lexicon of more than 25,000 words.
  • Otake, T., Davis, S. M., & Cutler, A. (1995). Listeners’ representations of within-word structure: A cross-linguistic and cross-dialectal investigation. In J. Pardo (Ed.), Proceedings of EUROSPEECH 95: Vol. 3 (pp. 1703-1706). Madrid: European Speech Communication Association.

    Abstract

    Japanese, British English and American English listeners were presented with spoken words in their native language, and asked to mark on a written transcript of each word the first natural division point in the word. The results showed clear and strong patterns of consensus, indicating that listeners have available to them conscious representations of within-word structure. Orthography did not play a strongly deciding role in the results. The patterns of response were at variance with results from on-line studies of speech segmentation, suggesting that the present task taps not those representations used in on-line listening, but levels of representation which may involve much richer knowledge of word-internal structure.
  • Cutler, A., & Fay, D. A. (Eds.). (1978). [Annotated re-issue of R. Meringer and C. Mayer: Versprechen und Verlesen, 1895]. Amsterdam: John Benjamins.
  • Cutler, A., & Fay, D. (1978). Introduction. In A. Cutler, & D. Fay (Eds.), [Annotated re-issue of R. Meringer and C. Mayer: Versprechen und Verlesen, 1895] (pp. ix-xl). Amsterdam: John Benjamins.
  • Cutler, A., & Cooper, W. E. (1978). Phoneme-monitoring in the context of different phonetic sequences. Journal of Phonetics, 6, 221-225.

    Abstract

    The order of some conjoined words is rigidly fixed (e.g. dribs and drabs/*drabs and dribs). Both phonetic and semantic factors can play a role in determining the fixed order. An experiment was conducted to test whether listeners’ reaction times for monitoring a predetermined phoneme are influenced by phonetic constraints on ordering. Two such constraints were investigated: monosyllable-bisyllable and high-low vowel sequences. In English, conjoined words occur in such sequences with much greater frequency than their converses, other factors being equal. Reaction times were significantly shorter for phoneme monitoring in monosyllable-bisyllable sequences than in bisyllable-monosyllable sequences. However, reaction times were not significantly different for high-low vs. low-high vowel sequences.
