Anne Cutler †

Publications

Displaying 1 - 40 of 40
  • Burnham, D., Ambikairajah, E., Arciuli, J., Bennamoun, M., Best, C. T., Bird, S., Butcher, A. R., Cassidy, S., Chetty, G., Cox, F. M., Cutler, A., Dale, R., Epps, J. R., Fletcher, J. M., Goecke, R., Grayden, D. B., Hajek, J. T., Ingram, J. C., Ishihara, S., Kemp, N. and 10 moreBurnham, D., Ambikairajah, E., Arciuli, J., Bennamoun, M., Best, C. T., Bird, S., Butcher, A. R., Cassidy, S., Chetty, G., Cox, F. M., Cutler, A., Dale, R., Epps, J. R., Fletcher, J. M., Goecke, R., Grayden, D. B., Hajek, J. T., Ingram, J. C., Ishihara, S., Kemp, N., Kinoshita, Y., Kuratate, T., Lewis, T. W., Loakes, D. E., Onslow, M., Powers, D. M., Rose, P., Togneri, R., Tran, D., & Wagner, M. (2009). A blueprint for a comprehensive Australian English auditory-visual speech corpus. In M. Haugh, K. Burridge, J. Mulder, & P. Peters (Eds.), Selected proceedings of the 2008 HCSNet Workshop on Designing the Australian National Corpus (pp. 96-107). Somerville, MA: Cascadilla Proceedings Project.

    Abstract

    Large auditory-visual (AV) speech corpora are the grist of modern research in speech science, but no such corpus exists for Australian English. This is unfortunate, for speech science is the brains behind speech technology and applications such as text-to-speech (TTS) synthesis, automatic speech recognition (ASR), speaker recognition and forensic identification, talking heads, and hearing prostheses. Advances in these research areas in Australia require a large corpus of Australian English. Here the authors describe a blueprint for building the Big Australian Speech Corpus (the Big ASC), a corpus of over 1,100 speakers from urban and rural Australia, including speakers of non-indigenous, indigenous, ethnocultural, and disordered forms of Australian English, each of whom would be sampled on three occasions in a range of speech tasks designed by the researchers who would be using the corpus.
  • Cutler, A., Davis, C., & Kim, J. (2009). Non-automaticity of use of orthographic knowledge in phoneme evaluation. In Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009) (pp. 380-383). Causal Productions Pty Ltd.

    Abstract

    Two phoneme goodness rating experiments addressed the role of orthographic knowledge in the evaluation of speech sounds. Ratings for the best tokens of /s/ were higher in words spelled with S (e.g., bless) than in words where /s/ was spelled with C (e.g., voice). This difference did not appear for analogous nonwords for which every lexical neighbour had either S or C spelling (pless, floice). Models of phonemic processing incorporating obligatory influence of lexical information in phonemic processing cannot explain this dissociation; the data are consistent with models in which phonemic decisions are not subject to necessary top-down lexical influence.
  • Cutler, A. (2009). Psycholinguistics in our time. In P. Rabbitt (Ed.), Inside psychology: A science over 50 years (pp. 91-101). Oxford: Oxford University Press.
  • Braun, B., Lemhöfer, K., & Cutler, A. (2008). English word stress as produced by English and Dutch speakers: The role of segmental and suprasegmental differences. In Proceedings of Interspeech 2008 (pp. 1953-1953).

    Abstract

    It has been claimed that Dutch listeners use suprasegmental cues (duration, spectral tilt) more than English listeners in distinguishing English word stress. We tested whether this asymmetry also holds in production, comparing the realization of English word stress by native English speakers and Dutch speakers. Results confirmed that English speakers centralize unstressed vowels more, while Dutch speakers of English make more use of suprasegmental differences.
  • Braun, B., Tagliapietra, L., & Cutler, A. (2008). Contrastive utterances make alternatives salient: Cross-modal priming evidence. In Proceedings of Interspeech 2008 (pp. 69-69).

    Abstract

    Sentences with contrastive intonation are assumed to presuppose contextual alternatives to the accented elements. Two cross-modal priming experiments tested in Dutch whether such contextual alternatives are automatically available to listeners. Contrastive associates – but not non- contrastive associates - were facilitated only when primes were produced in sentences with contrastive intonation, indicating that contrastive intonation makes unmentioned contextual alternatives immediately available. Possibly, contrastive contours trigger a “presupposition resolution mechanism” by which these alternatives become salient.
  • Cutler, A., McQueen, J. M., Butterfield, S., & Norris, D. (2008). Prelexically-driven perceptual retuning of phoneme boundaries. In Proceedings of Interspeech 2008 (pp. 2056-2056).

    Abstract

    Listeners heard an ambiguous /f-s/ in nonword contexts where only one of /f/ or /s/ was legal (e.g., frul/*srul or *fnud/snud). In later categorisation of a phonetic continuum from /f/ to /s/, their category boundaries had shifted; hearing -rul led to expanded /f/ categories, -nud expanded /s/. Thus phonotactic sequence information alone induces perceptual retuning of phoneme category boundaries; lexical access is not required.
  • Kooijman, V., Johnson, E. K., & Cutler, A. (2008). Reflections on reflections of infant word recognition. In A. D. Friederici, & G. Thierry (Eds.), Early language development: Bridging brain and behaviour (pp. 91-114). Amsterdam: Benjamins.
  • Cutler, A., & Broersma, M. (2005). Phonetic precision in listening. In W. J. Hardcastle, & J. M. Beck (Eds.), A figure of speech: A Festschrift for John Laver (pp. 63-91). Mahwah, NJ: Erlbaum.
  • Cutler, A., Klein, W., & Levinson, S. C. (2005). The cornerstones of twenty-first century psycholinguistics. In A. Cutler (Ed.), Twenty-first century psycholinguistics: Four cornerstones (pp. 1-20). Mahwah, NJ: Erlbaum.
  • Cutler, A. (2005). The lexical statistics of word recognition problems caused by L2 phonetic confusion. In Proceedings of the 9th European Conference on Speech Communication and Technology (pp. 413-416).
  • Cutler, A., McQueen, J. M., & Norris, D. (2005). The lexical utility of phoneme-category plasticity. In Proceedings of the ISCA Workshop on Plasticity in Speech Perception (PSP2005) (pp. 103-107).
  • Cutler, A. (Ed.). (2005). Twenty-first century psycholinguistics: Four cornerstones. Mahwah, NJ: Erlbaum.
  • Cutler, A. (2005). Lexical stress. In D. B. Pisoni, & R. E. Remez (Eds.), The handbook of speech perception (pp. 264-289). Oxford: Blackwell.
  • Cutler, A. (Ed.). (2005). Twenty-first century psycholinguistics: Four cornerstones. Hillsdale, NJ: Erlbaum.
  • Goudbeek, M., Smits, R., Cutler, A., & Swingley, D. (2005). Acquiring auditory and phonetic categories. In H. Cohen, & C. Lefebvre (Eds.), Handbook of categorization in cognitive science (pp. 497-513). Amsterdam: Elsevier.
  • Cutler, A., & Chen, H.-C. (1995). Phonological similarity effects in Cantonese word recognition. In K. Elenius, & P. Branderud (Eds.), Proceedings of the Thirteenth International Congress of Phonetic Sciences: Vol. 1 (pp. 106-109). Stockholm: Stockholm University.

    Abstract

    Two lexical decision experiments in Cantonese are described in which the recognition of spoken target words as a function of phonological similarity to a preceding prime is investigated. Phonological similaritv in first syllables produced inhibition, while similarity in second syllables led to facilitation. Differences between syllables in tonal and segmental structure had generally similar effects.
  • Cutler, A. (1995). Spoken word recognition and production. In J. L. Miller, & P. D. Eimas (Eds.), Speech, language and communication (pp. 97-136). New York: Academic Press.

    Abstract

    This chapter highlights that most language behavior consists of speaking and listening. The chapter also reveals differences and similarities between speaking and listening. The laboratory study of word production raises formidable problems; ensuring that a particular word is produced may subvert the spontaneous production process. Word production is investigated via slips and tip-of-the-tongue (TOT), primarily via instances of processing failure and via the technique of via the picture-naming task. The methodology of word production is explained in the chapter. The chapter also explains the phenomenon of interaction between various stages of word production and the process of speech recognition. In this context, it explores the difference between sound and meaning and examines whether or not the comparisons are appropriate between the processes of recognition and production of spoken words. It also describes the similarities and differences in the structure of the recognition and production systems. Finally, the chapter highlights the common issues in recognition and production research, which include the nuances of frequency of occurrence, morphological structure, and phonological structure.
  • Cutler, A. (1995). Spoken-word recognition. In G. Bloothooft, V. Hazan, D. Hubert, & J. Llisterri (Eds.), European studies in phonetics and speech communication (pp. 66-71). Utrecht: OTS.
  • Cutler, A. (1995). The perception of rhythm in spoken and written language. In J. Mehler, & S. Franck (Eds.), Cognition on cognition (pp. 283-288). Cambridge, MA: MIT Press.
  • Cutler, A., & McQueen, J. M. (1995). The recognition of lexical units in speech. In B. De Gelder, & J. Morais (Eds.), Speech and reading: A comparative approach (pp. 33-47). Hove, UK: Erlbaum.
  • Cutler, A. (1995). Universal and Language-Specific in the Development of Speech. Biology International, (Special Issue 33).
  • Otake, T., Davis, S. M., & Cutler, A. (1995). Listeners’ representations of within-word structure: A cross-linguistic and cross-dialectal investigation. In J. Pardo (Ed.), Proceedings of EUROSPEECH 95: Vol. 3 (pp. 1703-1706). Madrid: European Speech Communication Association.

    Abstract

    Japanese, British English and American English listeners were presented with spoken words in their native language, and asked to mark on a written transcript of each word the first natural division point in the word. The results showed clear and strong patterns of consensus, indicating that listeners have available to them conscious representations of within-word structure. Orthography did not play a strongly deciding role in the results. The patterns of response were at variance with results from on-line studies of speech segmentation, suggesting that the present task taps not those representations used in on-line listening, but levels of representation which may involve much richer knowledge of word-internal structure.
  • Allerhand, M., Butterfield, S., Cutler, A., & Patterson, R. (1992). Assessing syllable strength via an auditory model. In Proceedings of the Institute of Acoustics: Vol. 14 Part 6 (pp. 297-304). St. Albans, Herts: Institute of Acoustics.
  • Cutler, A., Kearns, R., Norris, D., & Scott, D. (1992). Listeners’ responses to extraneous signals coincident with English and French speech. In J. Pittam (Ed.), Proceedings of the 4th Australian International Conference on Speech Science and Technology (pp. 666-671). Canberra: Australian Speech Science and Technology Association.

    Abstract

    English and French listeners performed two tasks - click location and speeded click detection - with both English and French sentences, closely matched for syntactic and phonological structure. Clicks were located more accurately in open- than in closed-class words in both English and French; they were detected more rapidly in open- than in closed-class words in English, but not in French. The two listener groups produced the same pattern of responses, suggesting that higher-level linguistic processing was not involved in these tasks.
  • Cutler, A. (1992). Processing constraints of the native phonological repertoire on the native language. In Y. Tohkura, E. Vatikiotis-Bateson, & Y. Sagisaka (Eds.), Speech perception, production and linguistic structure (pp. 275-278). Tokyo: Ohmsha.
  • Cutler, A. (1992). Psychology and the segment. In G. Docherty, & D. Ladd (Eds.), Papers in laboratory phonology II: Gesture, segment, prosody (pp. 290-295). Cambridge: Cambridge University Press.
  • Cutler, A., & Robinson, T. (1992). Response time as a metric for comparison of speech recognition by humans and machines. In J. Ohala, T. Neary, & B. Derwing (Eds.), Proceedings of the Second International Conference on Spoken Language Processing: Vol. 1 (pp. 189-192). Alberta: University of Alberta.

    Abstract

    The performance of automatic speech recognition systems is usually assessed in terms of error rate. Human speech recognition produces few errors, but relative difficulty of processing can be assessed via response time techniques. We report the construction of a measure analogous to response time in a machine recognition system. This measure may be compared directly with human response times. We conducted a trial comparison of this type at the phoneme level, including both tense and lax vowels and a variety of consonant classes. The results suggested similarities between human and machine processing in the case of consonants, but differences in the case of vowels.
  • Cutler, A. (1992). The perception of speech: Psycholinguistic aspects. In W. Bright (Ed.), International encyclopedia of language: Vol. 3 (pp. 181-183). New York: Oxford University Press.
  • Cutler, A. (1992). The production and perception of word boundaries. In Y. Tohkura, E. Vatikiotis-Bateson, & Y. Sagisaka (Eds.), Speech perception, production and linguistic structure (pp. 419-425). Tokyo: Ohsma.
  • Cutler, A. (1992). Why not abolish psycholinguistics? In W. Dressler, H. Luschützky, O. Pfeiffer, & J. Rennison (Eds.), Phonologica 1988 (pp. 77-87). Cambridge: Cambridge University Press.
  • McQueen, J. M., & Cutler, A. (1992). Words within words: Lexical statistics and lexical access. In J. Ohala, T. Neary, & B. Derwing (Eds.), Proceedings of the Second International Conference on Spoken Language Processing: Vol. 1 (pp. 221-224). Alberta: University of Alberta.

    Abstract

    This paper presents lexical statistics on the pattern of occurrence of words embedded in other words. We report the results of an analysis of 25000 words, varying in length from two to six syllables, extracted from a phonetically-coded English dictionary (The Longman Dictionary of Contemporary English). Each syllable, and each string of syllables within each word was checked against the dictionary. Two analyses are presented: the first used a complete list of polysyllables, with look-up on the entire dictionary; the second used a sublist of content words, counting only embedded words which were themselves content words. The results have important implications for models of human speech recognition. The efficiency of these models depends, in different ways, on the number and location of words within words.
  • Norris, D., Van Ooijen, B., & Cutler, A. (1992). Speeded detection of vowels and steady-state consonants. In J. Ohala, T. Neary, & B. Derwing (Eds.), Proceedings of the Second International Conference on Spoken Language Processing; Vol. 2 (pp. 1055-1058). Alberta: University of Alberta.

    Abstract

    We report two experiments in which vowels and steady-state consonants served as targets in a speeded detection task. In the first experiment, two vowels were compared with one voiced and once unvoiced fricative. Response times (RTs) to the vowels were longer than to the fricatives. The error rate was higher for the consonants. Consonants in word-final position produced the shortest RTs, For the vowels, RT correlated negatively with target duration. In the second experiment, the same two vowel targets were compared with two nasals. This time there was no significant difference in RTs, but the error rate was still significantly higher for the consonants. Error rate and length correlated negatively for the vowels only. We conclude that RT differences between phonemes are independent of vocalic or consonantal status. Instead, we argue that the process of phoneme detection reflects more finely grained differences in acoustic/articulatory structure within the phonemic repertoire.
  • Cutler, A. (1987). Components of prosodic effects in speech recognition. In Proceedings of the Eleventh International Congress of Phonetic Sciences: Vol. 1 (pp. 84-87). Tallinn: Academy of Sciences of the Estonian SSR, Institute of Language and Literature.

    Abstract

    Previous research has shown that listeners use the prosodic structure of utterances in a predictive fashion in sentence comprehension, to direct attention to accented words. Acoustically identical words spliced into sentence contexts arc responded to differently if the prosodic structure of the context is \ aricd: when the preceding prosody indicates that the word will he accented, responses are faster than when the preceding prosodv is inconsistent with accent occurring on that word. In the present series of experiments speech hybridisation techniques were first used to interchange the timing patterns within pairs of prosodic variants of utterances, independently of the pitch and intensity contours. The time-adjusted utterances could then serve as a basis lor the orthogonal manipulation of the three prosodic dimensions of pilch, intensity and rhythm. The overall pattern of results showed that when listeners use prosody to predict accent location, they do not simply rely on a single prosodic dimension, hut exploit the interaction between pitch, intensity and rhythm.
  • Cutler, A. (1987). Speaking for listening. In A. Allport, D. MacKay, W. Prinz, & E. Scheerer (Eds.), Language perception and production: Relationships between listening, speaking, reading and writing (pp. 23-40). London: Academic Press.

    Abstract

    Speech production is constrained at all levels by the demands of speech perception. The speaker's primary aim is successful communication, and to this end semantic, syntactic and lexical choices are directed by the needs of the listener. Even at the articulatory level, some aspects of production appear to be perceptually constrained, for example the blocking of phonological distortions under certain conditions. An apparent exception to this pattern is word boundary information, which ought to be extremely useful to listeners, but which is not reliably coded in speech. It is argued that the solution to this apparent problem lies in rethinking the concept of the boundary of the lexical access unit. Speech rhythm provides clear information about the location of stressed syllables, and listeners do make use of this information. If stressed syllables can serve as the determinants of word lexical access codes, then once again speakers are providing precisely the necessary form of speech information to facilitate perception.
  • Cutler, A., & Carter, D. (1987). The prosodic structure of initial syllables in English. In J. Laver, & M. Jack (Eds.), Proceedings of the European Conference on Speech Technology: Vol. 1 (pp. 207-210). Edinburgh: IEE.
  • Cutler, A. (1983). Lexical complexity and sentence processing. In G. B. Flores d'Arcais, & R. J. Jarvella (Eds.), The process of language understanding (pp. 43-79). Chichester, Sussex: Wiley.
  • Cutler, A., & Ladd, D. R. (Eds.). (1983). Prosody: Models and measurements. Heidelberg: Springer.
  • Cutler, A. (1983). Semantics, syntax and sentence accent. In M. Van den Broecke, & A. Cohen (Eds.), Proceedings of the Tenth International Congress of Phonetic Sciences (pp. 85-91). Dordrecht: Foris.
  • Cutler, A. (1983). Speakers’ conceptions of the functions of prosody. In A. Cutler, & D. R. Ladd (Eds.), Prosody: Models and measurements (pp. 79-91). Heidelberg: Springer.
  • Ladd, D. R., & Cutler, A. (1983). Models and measurements in the study of prosody. In A. Cutler, & D. R. Ladd (Eds.), Prosody: Models and measurements (pp. 1-10). Heidelberg: Springer.

Share this page