Asano, Y., Yuan, C., Grohe, A.-K., Weber, A., Antoniou, M., & Cutler, A. (2020). Uptalk interpretation as a function of listening experience. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 735-739). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-150
Abstract: The term “uptalk” describes utterance-final pitch rises that carry no sentence-structural information. Uptalk is usually dialectal or sociolectal, and Australian English (AusEng) is particularly known for this attribute. We ask here whether experience with an uptalk variety affects listeners’ ability to categorise rising pitch contours on the basis of the timing and height of their onset and offset. Listeners were two groups of native English speakers (AusEng and American English), and three groups of listeners with L2 English: one group with Mandarin as L1 and experience of listening to AusEng, one with German as L1 and experience of listening to AusEng, and one with German as L1 but no AusEng experience. They heard nouns (e.g. flower, piano) in the framework “Got a NOUN”, each ending with a pitch rise artificially manipulated on three contrasts: low vs. high rise onset, low vs. high rise offset and early vs. late rise onset. Their task was to categorise the tokens as “question” or “statement”, and we analysed the effect of the pitch contrasts on their judgements. Only the native AusEng listeners were able to use the pitch contrasts systematically in making these categorisations.
Yu, J., Mailhammer, R., & Cutler, A. (2020). Vocabulary structure affects word recognition: Evidence from German listeners. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 474-478). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-97
Abstract: Lexical stress is realised similarly in English, German, and Dutch. On a suprasegmental level, stressed syllables tend to be longer and more acoustically salient than unstressed syllables; segmentally, vowels in unstressed syllables are often reduced. The frequency of unreduced unstressed syllables (where only the suprasegmental cues indicate lack of stress), however, differs across the languages. The present studies test whether listener behaviour is affected by these vocabulary differences, by investigating German listeners’ use of suprasegmental cues to lexical stress in German and English word recognition. In a forced-choice identification task, German listeners correctly assigned single-syllable fragments (e.g., Kon-) to one of two words differing in stress (KONto, konZEPT). Thus, German listeners can exploit suprasegmental information for identifying words. German listeners also performed above chance in a similar task in English (with, e.g., DIver, diVERT), i.e., their sensitivity to these cues also transferred to a non-native language. An English listener group, in contrast, failed in the English fragment task. These findings mirror vocabulary patterns: German has more words with unreduced unstressed syllables than English does.
Ip, M. H. K., & Cutler, A. (2018). Asymmetric efficiency of juncture perception in L1 and L2. In K. Klessa, J. Bachan, A. Wagner, M. Karpiński, & D. Śledziński (Eds.), Proceedings of Speech Prosody 2018 (pp. 289-296). Baixas, France: ISCA. doi:10.21437/SpeechProsody.2018-59
Abstract: In two experiments, Mandarin listeners resolved potential syntactic ambiguities in spoken utterances in (a) their native language (L1) and (b) English, which they had learned as a second language (L2). A new disambiguation task was used, requiring speeded responses to select the correct meaning for structurally ambiguous sentences. Importantly, the ambiguities used in the study are identical in Mandarin and in English, and production data show that prosodic disambiguation of this type of ambiguity is also realised very similarly in the two languages. The perceptual results here showed, however, that listeners’ response patterns differed for L1 and L2, although there was a significant increase in similarity between the two response patterns with increasing exposure to the L2. Thus identical ambiguity and comparable disambiguation patterns in L1 and L2 do not lead to immediate application of the appropriate L1 listening strategy to L2; instead, it appears that such a strategy may have to be learned anew for the L2.
Ip, M. H. K., & Cutler, A. (2018). Cue equivalence in prosodic entrainment for focus detection. In J. Epps, J. Wolfe, J. Smith, & C. Jones (Eds.), Proceedings of the 17th Australasian International Conference on Speech Science and Technology (pp. 153-156).
Abstract: Using a phoneme detection task, the present series of experiments examines whether listeners can entrain to different combinations of prosodic cues to predict where focus will fall in an utterance. The stimuli were recorded by four female native speakers of Australian English who happened to have used different prosodic cues to produce sentences with prosodic focus: (a) a combination of duration cues, mean and maximum F0, F0 range, and a longer pre-target interval before the focused word onset; (b) only mean F0 cues; (c) only pre-target interval; and (d) only duration cues. Results revealed that listeners could entrain in almost every condition except where duration was the only reliable cue. Our findings suggest that listeners are flexible in the cues they use for focus processing.
Cutler, A., Burchfield, L. A., & Antoniou, M. (2018). Factors affecting talker adaptation in a second language. In J. Epps, J. Wolfe, J. Smith, & C. Jones (Eds.), Proceedings of the 17th Australasian International Conference on Speech Science and Technology (pp. 33-36).
Abstract: Listeners adapt rapidly to previously unheard talkers by adjusting phoneme categories using lexical knowledge, in a process termed lexically-guided perceptual learning. Although this is firmly established for listening in the native language (L1), perceptual flexibility in second languages (L2) is as yet less well understood. We report two experiments examining L1 and L2 perceptual learning, the first in Mandarin-English late bilinguals, the second in Australian learners of Mandarin. Both studies showed stronger learning in L1; in L2, however, learning appeared for the English-L1 group but not for the Mandarin-L1 group. Phonological mapping differences from the L1 to the L2 are suggested as the reason for this result.
Burnham, D., Ambikairajah, E., Arciuli, J., Bennamoun, M., Best, C. T., Bird, S., Butcher, A. R., Cassidy, S., Chetty, G., Cox, F. M., Cutler, A., Dale, R., Epps, J. R., Fletcher, J. M., Goecke, R., Grayden, D. B., Hajek, J. T., Ingram, J. C., Ishihara, S., Kemp, N., Kinoshita, Y., Kuratate, T., Lewis, T. W., Loakes, D. E., Onslow, M., Powers, D. M., Rose, P., Togneri, R., Tran, D., & Wagner, M. (2009). A blueprint for a comprehensive Australian English auditory-visual speech corpus. In M. Haugh, K. Burridge, J. Mulder, & P. Peters (Eds.), Selected proceedings of the 2008 HCSNet Workshop on Designing the Australian National Corpus (pp. 96-107). Somerville, MA: Cascadilla Proceedings Project.
Abstract: Large auditory-visual (AV) speech corpora are the grist of modern research in speech science, but no such corpus exists for Australian English. This is unfortunate, for speech science is the brains behind speech technology and applications such as text-to-speech (TTS) synthesis, automatic speech recognition (ASR), speaker recognition and forensic identification, talking heads, and hearing prostheses. Advances in these research areas in Australia require a large corpus of Australian English. Here the authors describe a blueprint for building the Big Australian Speech Corpus (the Big ASC), a corpus of over 1,100 speakers from urban and rural Australia, including speakers of non-indigenous, indigenous, ethnocultural, and disordered forms of Australian English, each of whom would be sampled on three occasions in a range of speech tasks designed by the researchers who would be using the corpus.
Cutler, A., Davis, C., & Kim, J. (2009). Non-automaticity of use of orthographic knowledge in phoneme evaluation. In Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009) (pp. 380-383). Causal Productions Pty Ltd.
Abstract: Two phoneme goodness rating experiments addressed the role of orthographic knowledge in the evaluation of speech sounds. Ratings for the best tokens of /s/ were higher in words spelled with S (e.g., bless) than in words where /s/ was spelled with C (e.g., voice). This difference did not appear for analogous nonwords for which every lexical neighbour had either S or C spelling (pless, floice). Models of phonemic processing incorporating obligatory influence of lexical information in phonemic processing cannot explain this dissociation; the data are consistent with models in which phonemic decisions are not subject to necessary top-down lexical influence.
Butterfield, S., & Cutler, A. (1990). Intonational cues to word boundaries in clear speech? In Proceedings of the Institute of Acoustics: Vol 12, part 10 (pp. 87-94). St. Albans, Herts.: Institute of Acoustics.
Cutler, A. (1990). Syllabic lengthening as a word boundary cue. In R. Seidl (Ed.), Proceedings of the 3rd Australian International Conference on Speech Science and Technology (pp. 324-328). Canberra: Australian Speech Science and Technology Association.
Abstract: Bisyllabic sequences which could be interpreted as one word or two were produced in sentence contexts by a trained speaker, and syllabic durations measured. Listeners judged whether the bisyllables, excised from context, were one word or two. The proportion of two-word choices correlated positively with measured duration, but only for bisyllables stressed on the second syllable. The results may suggest a limit for listener sensitivity to syllabic lengthening as a word boundary cue.
Cutler, A., Norris, D., & Van Ooijen, B. (1990). Vowels as phoneme detection targets. In Proceedings of the First International Conference on Spoken Language Processing (pp. 581-584).
Abstract: Phoneme detection is a psycholinguistic task in which listeners' response time to detect the presence of a pre-specified phoneme target is measured. Typically, detection tasks have used consonant targets. This paper reports two experiments in which subjects responded to vowels as phoneme detection targets. In the first experiment, targets occurred in real words, in the second in nonsense words. Response times were long by comparison with consonantal targets. Targets in initial syllables were responded to much more slowly than targets in second syllables. Strong vowels were responded to faster than reduced vowels in real words but not in nonwords. These results suggest that the process of phoneme detection produces different results for vowels and for consonants. We discuss possible explanations for this difference, in particular the possibility of language-specificity.
Cutler, A., & Butterfield, S. (1986). The perceptual integrity of initial consonant clusters. In R. Lawrence (Ed.), Speech and Hearing: Proceedings of the Institute of Acoustics (pp. 31-36). Edinburgh: Institute of Acoustics.
Cutler, A. (1983). Semantics, syntax and sentence accent. In M. Van den Broecke & A. Cohen (Eds.), Proceedings of the Tenth International Congress of Phonetic Sciences (pp. 85-91). Dordrecht: Foris.
Cutler, A. (1974). On saying what you mean without meaning what you say. In M. Galy, R. Fox, & A. Bruck (Eds.), Papers from the Tenth Regional Meeting, Chicago Linguistic Society (pp. 117-127). Chicago, Ill.: CLS.