-
Asano, Y., Yuan, C., Grohe, A.-K., Weber, A., Antoniou, M., & Cutler, A. (2020). Uptalk interpretation as a function of listening experience. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 735-739). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-150.
Abstract
The term “uptalk” describes utterance-final pitch rises that carry no sentence-structural information. Uptalk is usually dialectal or sociolectal, and Australian English (AusEng) is particularly known for this attribute. We ask here whether experience with an uptalk variety affects listeners’ ability to categorise rising pitch contours on the basis of the timing and height of their onset and offset. Listeners were two groups of English-speakers (AusEng, and American English), and three groups of listeners with L2 English: one group with Mandarin as L1 and experience of listening to AusEng, one with German as L1 and experience of listening to AusEng, and one with German as L1 but no AusEng experience. They heard nouns (e.g. flower, piano) in the framework “Got a NOUN”, each ending with a pitch rise artificially manipulated on three contrasts: low vs. high rise onset, low vs. high rise offset and early vs. late rise onset. Their task was to categorise the tokens as “question” or “statement”, and we analysed the effect of the pitch contrasts on their judgements. Only the native AusEng listeners were able to use the pitch contrasts systematically in making these categorisations. -
Yu, J., Mailhammer, R., & Cutler, A. (2020). Vocabulary structure affects word recognition: Evidence from German listeners. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 474-478). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-97.
Abstract
Lexical stress is realised similarly in English, German, and
Dutch. On a suprasegmental level, stressed syllables tend to be
longer and more acoustically salient than unstressed syllables;
segmentally, vowels in unstressed syllables are often reduced.
The frequency of unreduced unstressed syllables (where only
the suprasegmental cues indicate lack of stress), however,
differs across the languages. The present studies test whether
listener behaviour is affected by these vocabulary differences,
by investigating German listeners’ use of suprasegmental cues
to lexical stress in German and English word recognition. In a
forced-choice identification task, German listeners correctly
assigned single-syllable fragments (e.g., Kon-) to one of two
words differing in stress (KONto, konZEPT). Thus, German
listeners can exploit suprasegmental information for
identifying words. German listeners also performed above
chance in a similar task in English (with, e.g., DIver, diVERT),
i.e., their sensitivity to these cues also transferred to a nonnative
language. An English listener group, in contrast, failed
in the English fragment task. These findings mirror vocabulary
patterns: German has more words with unreduced unstressed
syllables than English does. -
Bruggeman, L., & Cutler, A. (2019). The dynamics of lexical activation and competition in bilinguals’ first versus second language. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019) (pp. 1342-1346). Canberra, Australia: Australasian Speech Science and Technology Association Inc.
Abstract
Speech input causes listeners to activate multiple
candidate words which then compete with one
another. These include onset competitors, that share a
beginning (bumper, butter), but also, counterintuitively,
rhyme competitors, sharing an ending
(bumper, jumper). In L1, competition is typically
stronger for onset than for rhyme. In L2, onset
competition has been attested but rhyme competition
has heretofore remained largely unexamined. We
assessed L1 (Dutch) and L2 (English) word
recognition by the same late-bilingual individuals. In
each language, eye gaze was recorded as listeners
heard sentences and viewed sets of drawings: three
unrelated, one depicting an onset or rhyme competitor
of a word in the input. Activation patterns revealed
substantial onset competition but no significant
rhyme competition in either L1 or L2. Rhyme
competition may thus be a “luxury” feature of
maximally efficient listening, to be abandoned when
resources are scarcer, as in listening by late
bilinguals, in either language. -
Cutler, A., Burchfield, A., & Antoniou, M. (2019). A criterial interlocutor tally for successful talker adaptation? In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019) (pp. 1485-1489). Canberra, Australia: Australasian Speech Science and Technology Association Inc.
Abstract
Part of the remarkable efficiency of listening is
accommodation to unfamiliar talkers’ specific
pronunciations by retuning of phonemic intercategory
boundaries. Such retuning occurs in second
(L2) as well as first language (L1); however, recent
research with emigrés revealed successful adaptation
in the environmental L2 but, unprecedentedly, not in
L1 despite continuing L1 use. A possible explanation
involving relative exposure to novel talkers is here
tested in heritage language users with Mandarin as
family L1 and English as environmental language. In
English, exposure to an ambiguous sound in
disambiguating word contexts prompted the expected
adjustment of phonemic boundaries in subsequent
categorisation. However, no adjustment occurred in
Mandarin, again despite regular use. Participants
reported highly asymmetric interlocutor counts in the
two languages. We conclude that successful retuning
ability requires regular exposure to novel talkers in
the language in question, a criterion not met for the
emigrés’ or for these heritage users’ L1. -
Joo, H., Jang, J., Kim, S., Cho, T., & Cutler, A. (2019). Prosodic structural effects on coarticulatory vowel nasalization in Australian English in comparison to American English. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019) (pp. 835-839). Canberra, Australia: Australasian Speech Science and Technology Association Inc.
Abstract
This study investigates effects of prosodic factors (prominence, boundary) on coarticulatory V-nasalization in Australian English (AusE) in CVN and NVC in comparison to those in American English (AmE). As in AmE, prominence was found to lengthen N, but to reduce V-nasalization, enhancing N’s nasality and V’s orality, respectively (paradigmatic contrast enhancement). But the prominence effect in CVN was more robust than that in AmE. Again similar to findings in AmE, boundary induced a reduction of N-duration and V-nasalization phrase-initially (syntagmatic contrast enhancement), and increased the nasality of both C and V phrase-finally. But AusE showed some differences in terms of the magnitude of V-nasalization and N-duration. The results suggest that the linguistic contrast enhancements underlie prosodic-structure modulation of coarticulatory V-nasalization in comparable ways across dialects, while the fine phonetic detail indicates that the phonetics-prosody interplay is internalized in the individual dialect’s phonetic grammar. -
Ip, M. H. K., & Cutler, A. (2018). Asymmetric efficiency of juncture perception in L1 and L2. In K. Klessa, J. Bachan, A. Wagner, M. Karpiński, & D. Śledziński (Eds.), Proceedings of Speech Prosody 2018 (pp. 289-296). Baixas, France: ISCA. doi:10.21437/SpeechProsody.2018-59.
Abstract
In two experiments, Mandarin listeners resolved potential syntactic ambiguities in spoken utterances in (a) their native language (L1) and (b) English which they had learned as a second language (L2). A new disambiguation task was used, requiring speeded responses to select the correct meaning for structurally ambiguous sentences. Importantly, the ambiguities used in the study are identical in Mandarin and in English, and production data show that prosodic disambiguation of this type of ambiguity is also realised very similarly in the two languages. The perceptual results here showed however that listeners’ response patterns differed for L1 and L2, although there was a significant increase in similarity between the two response patterns with increasing exposure to the L2. Thus identical ambiguity and comparable disambiguation patterns in L1 and L2 do not lead to immediate application of the appropriate L1 listening strategy to L2; instead, it appears that such a strategy may have to be learned anew for the L2. -
Ip, M. H. K., & Cutler, A. (2018). Cue equivalence in prosodic entrainment for focus detection. In J. Epps, J. Wolfe, J. Smith, & C. Jones (Eds.), Proceedings of the 17th Australasian International Conference on Speech Science and Technology (pp. 153-156).
Abstract
Using a phoneme detection task, the present series of
experiments examines whether listeners can entrain to
different combinations of prosodic cues to predict where focus
will fall in an utterance. The stimuli were recorded by four
female native speakers of Australian English who happened to
have used different prosodic cues to produce sentences with
prosodic focus: a combination of duration cues, mean and
maximum F0, F0 range, and longer pre-target interval before
the focused word onset, only mean F0 cues, only pre-target
interval, and only duration cues. Results revealed that listeners
can entrain in almost every condition except for where
duration was the only reliable cue. Our findings suggest that
listeners are flexible in the cues they use for focus processing. -
Cutler, A., Burchfield, L. A., & Antoniou, M. (2018). Factors affecting talker adaptation in a second language. In J. Epps, J. Wolfe, J. Smith, & C. Jones (Eds.), Proceedings of the 17th Australasian International Conference on Speech Science and Technology (pp. 33-36).
Abstract
Listeners adapt rapidly to previously unheard talkers by
adjusting phoneme categories using lexical knowledge, in a
process termed lexically-guided perceptual learning. Although
this is firmly established for listening in the native language
(L1), perceptual flexibility in second languages (L2) is as yet
less well understood. We report two experiments examining L1
and L2 perceptual learning, the first in Mandarin-English late
bilinguals, the second in Australian learners of Mandarin. Both
studies showed stronger learning in L1; in L2, however,
learning appeared for the English-L1 group but not for the
Mandarin-L1 group. Phonological mapping differences from
the L1 to the L2 are suggested as the reason for this result. -
Burnham, D., Ambikairajah, E., Arciuli, J., Bennamoun, M., Best, C. T., Bird, S., Butcher, A. R., Cassidy, S., Chetty, G., Cox, F. M., Cutler, A., Dale, R., Epps, J. R., Fletcher, J. M., Goecke, R., Grayden, D. B., Hajek, J. T., Ingram, J. C., Ishihara, S., Kemp, N., Kinoshita, Y., Kuratate, T., Lewis, T. W., Loakes, D. E., Onslow, M., Powers, D. M., Rose, P., Togneri, R., Tran, D., & Wagner, M. (2009). A blueprint for a comprehensive Australian English auditory-visual speech corpus. In M. Haugh, K. Burridge, J. Mulder, & P. Peters (Eds.), Selected proceedings of the 2008 HCSNet Workshop on Designing the Australian National Corpus (pp. 96-107). Somerville, MA: Cascadilla Proceedings Project.
Abstract
Large auditory-visual (AV) speech corpora are the grist of modern research in speech science, but no such corpus exists for Australian English. This is unfortunate, for speech science is the brains behind speech technology and applications such as text-to-speech (TTS) synthesis, automatic speech recognition (ASR), speaker recognition and forensic identification, talking heads, and hearing prostheses. Advances in these research areas in Australia require a large corpus of Australian English. Here the authors describe a blueprint for building the Big Australian Speech Corpus (the Big ASC), a corpus of over 1,100 speakers from urban and rural Australia, including speakers of non-indigenous, indigenous, ethnocultural, and disordered forms of Australian English, each of whom would be sampled on three occasions in a range of speech tasks designed by the researchers who would be using the corpus. -
Cutler, A., Davis, C., & Kim, J. (2009). Non-automaticity of use of orthographic knowledge in phoneme evaluation. In Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009) (pp. 380-383). Causal Productions Pty Ltd.
Abstract
Two phoneme goodness rating experiments addressed the role of orthographic knowledge in the evaluation of speech sounds. Ratings for the best tokens of /s/ were higher in words spelled with S (e.g., bless) than in words where /s/ was spelled with C (e.g., voice). This difference did not appear for analogous nonwords for which every lexical neighbour had either S or C spelling (pless, floice). Models of phonemic processing incorporating obligatory influence of lexical information in phonemic processing cannot explain this dissociation; the data are consistent with models in which phonemic decisions are not subject to necessary top-down lexical influence. -
Cooper, N., & Cutler, A. (2004). Perception of non-native phonemes in noise. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 469-472). Seoul: Sunjijn Printing Co.
Abstract
We report an investigation of the perception of American English phonemes by Dutch listeners proficient in English. Listeners identified either the consonant or the vowel in most possible English CV and VC syllables. The syllables were embedded in multispeaker babble at three signal-to-noise ratios (16 dB, 8 dB, and 0 dB). Effects of signal-to-noise ratio on vowel and consonant identification are discussed as a function of syllable position and of relationship to the native phoneme inventory. Comparison of the results with previously reported data from native listeners reveals that noise affected the responding of native and non-native listeners similarly. -
Cutler, A., Norris, D., & Sebastián-Gallés, N. (2004). Phonemic repertoire and similarity within the vocabulary. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 65-68). Seoul: Sunjijn Printing Co.
Abstract
Language-specific differences in the size and distribution of the phonemic repertoire can have implications for the task facing listeners in recognising spoken words. A language with more phonemes will allow shorter words and reduced embedding of short words within longer ones, decreasing the potential for spurious lexical competitors to be activated by speech signals. We demonstrate that this is the case via comparative analyses of the vocabularies of English and Spanish. A language which uses suprasegmental as well as segmental contrasts, however, can substantially reduce the extent of spurious embedding. -
Allerhand, M., Butterfield, S., Cutler, A., & Patterson, R. (1992). Assessing syllable strength via an auditory model. In Proceedings of the Institute of Acoustics: Vol. 14 Part 6 (pp. 297-304). St. Albans, Herts: Institute of Acoustics.
-
Cutler, A., Kearns, R., Norris, D., & Scott, D. (1992). Listeners’ responses to extraneous signals coincident with English and French speech. In J. Pittam (Ed.), Proceedings of the 4th Australian International Conference on Speech Science and Technology (pp. 666-671). Canberra: Australian Speech Science and Technology Association.
Abstract
English and French listeners performed two tasks - click location and speeded click detection - with both English and French sentences, closely matched for syntactic and phonological structure. Clicks were located more accurately in open- than in closed-class words in both English and French; they were detected more rapidly in open- than in closed-class words in English, but not in French. The two listener groups produced the same pattern of responses, suggesting that higher-level linguistic processing was not involved in these tasks. -
Cutler, A., & Robinson, T. (1992). Response time as a metric for comparison of speech recognition by humans and machines. In J. Ohala, T. Neary, & B. Derwing (Eds.), Proceedings of the Second International Conference on Spoken Language Processing: Vol. 1 (pp. 189-192). Alberta: University of Alberta.
Abstract
The performance of automatic speech recognition systems is usually assessed in terms of error rate. Human speech recognition produces few errors, but relative difficulty of processing can be assessed via response time techniques. We report the construction of a measure analogous to response time in a machine recognition system. This measure may be compared directly with human response times. We conducted a trial comparison of this type at the phoneme level, including both tense and lax vowels and a variety of consonant classes. The results suggested similarities between human and machine processing in the case of consonants, but differences in the case of vowels. -
McQueen, J. M., & Cutler, A. (1992). Words within words: Lexical statistics and lexical access. In J. Ohala, T. Neary, & B. Derwing (Eds.), Proceedings of the Second International Conference on Spoken Language Processing: Vol. 1 (pp. 221-224). Alberta: University of Alberta.
Abstract
This paper presents lexical statistics on the pattern of occurrence of words embedded in other words. We report the results of an analysis of 25000 words, varying in length from two to six syllables, extracted from a phonetically-coded English dictionary (The Longman Dictionary of Contemporary English). Each syllable, and each string of syllables within each word was checked against the dictionary. Two analyses are presented: the first used a complete list of polysyllables, with look-up on the entire dictionary; the second used a sublist of content words, counting only embedded words which were themselves content words. The results have important implications for models of human speech recognition. The efficiency of these models depends, in different ways, on the number and location of words within words. -
Norris, D., Van Ooijen, B., & Cutler, A. (1992). Speeded detection of vowels and steady-state consonants. In J. Ohala, T. Neary, & B. Derwing (Eds.), Proceedings of the Second International Conference on Spoken Language Processing: Vol. 2 (pp. 1055-1058). Alberta: University of Alberta.
Abstract
We report two experiments in which vowels and steady-state consonants served as targets in a speeded detection task. In the first experiment, two vowels were compared with one voiced and one unvoiced fricative. Response times (RTs) to the vowels were longer than to the fricatives. The error rate was higher for the consonants. Consonants in word-final position produced the shortest RTs. For the vowels, RT correlated negatively with target duration. In the second experiment, the same two vowel targets were compared with two nasals. This time there was no significant difference in RTs, but the error rate was still significantly higher for the consonants. Error rate and length correlated negatively for the vowels only. We conclude that RT differences between phonemes are independent of vocalic or consonantal status. Instead, we argue that the process of phoneme detection reflects more finely grained differences in acoustic/articulatory structure within the phonemic repertoire. -
Cutler, A. (1974). On saying what you mean without meaning what you say. In M. Galy, R. Fox, & A. Bruck (Eds.), Papers from the Tenth Regional Meeting, Chicago Linguistic Society (pp. 117-127). Chicago, Ill.: CLS. -
Cutler, A. (1970). An experimental method for semantic field study. Linguistic Communications, 2, 87-94.
Abstract
This paper emphasizes the need for empirical research and objective discovery procedures in semantics, and illustrates a method by which these goals may be obtained. The aim of the methodology described is to provide a description of the internal structure of a semantic field by eliciting the description--in an objective, standardized manner--from a representative group of native speakers. This would produce results that would be equally obtainable by any linguist using the same method under the same conditions with a similarly representative set of informants. The standardized method suggested by the author is the Semantic Differential developed by C. E. Osgood in the 1950's. Applying this method to semantic research, it is further hypothesized that, should different members of a semantic field be employed as concepts on a Semantic Differential task, a factor analysis of the results would reveal the dimensions operative within the body of data. The author demonstrates the use of the Semantic Differential and factor analysis in an actual experiment.