Anne Cutler

Publications

Displaying 1 - 43 of 43
  • Choi, J., Broersma, M., & Cutler, A. (2018). Phonetic learning is not enhanced by sequential exposure to more than one language. Linguistic Research, 35(3), 567-581. doi:10.17250/khisli.35.3.201812.006.

    Abstract

    Several studies have documented that international adoptees, who in early years have experienced a change from a language used in their birth country to a new language in an adoptive country, benefit from the limited early exposure to the birth language when relearning that language’s sounds later in life. The adoptees’ relearning advantages have been argued to be conferred by lasting birth-language knowledge obtained from the early exposure. However, it is also plausible to assume that the advantages may arise from adoptees’ superior ability to learn language sounds in general, as a result of their unusual linguistic experience, i.e., exposure to multiple languages in sequence early in life. If this is the case, then the adoptees’ relearning benefits should generalize to previously unheard language sounds, rather than be limited to their birth-language sounds. In the present study, adult Korean adoptees in the Netherlands and matched Dutch-native controls were trained on identifying a Japanese length distinction to which they had never been exposed before. The adoptees and Dutch controls did not differ on any test carried out before, during, or after the training, indicating that observed adoptee advantages for birth-language relearning do not generalize to novel, previously unheard language sounds. The finding thus fails to support the suggestion that birth-language relearning advantages may arise from enhanced ability to learn language sounds in general conferred by early experience in multiple languages. Rather, our finding supports the original contention that such advantages involve memory traces obtained before adoption
  • Ip, M. H. K., & Cutler, A. (2018). Asymmetric efficiency of juncture perception in L1 and L2. In K. Klessa, J. Bachan, A. Wagner, M. Karpiński, & D. Śledziński (Eds.), Proceedings of Speech Prosody 2018 (pp. 289-296). Baixas, France: ISCA. doi:10.21437/SpeechProsody.2018-59.

    Abstract

    In two experiments, Mandarin listeners resolved potential syntactic ambiguities in spoken utterances in (a) their native language (L1) and (b) English which they had learned as a second language (L2). A new disambiguation task was used, requiring speeded responses to select the correct meaning for structurally ambiguous sentences. Importantly, the ambiguities used in the study are identical in Mandarin and in English, and production data show that prosodic disambiguation of this type of ambiguity is also realised very similarly in the two languages. The perceptual results here showed however that listeners’ response patterns differed for L1 and L2, although there was a significant increase in similarity between the two response patterns with increasing exposure to the L2. Thus identical ambiguity and comparable disambiguation patterns in L1 and L2 do not lead to immediate application of the appropriate L1 listening strategy to L2; instead, it appears that such a strategy may have to be learned anew for the L2.
  • Ip, M. H. K., & Cutler, A. (2018). Cue equivalence in prosodic entrainment for focus detection. In J. Epps, J. Wolfe, J. Smith, & C. Jones (Eds.), Proceedings of the 17th Australasian International Conference on Speech Science and Technology (pp. 153-156).

    Abstract

    Using a phoneme detection task, the present series of experiments examines whether listeners can entrain to different combinations of prosodic cues to predict where focus will fall in an utterance. The stimuli were recorded by four female native speakers of Australian English who happened to have used different prosodic cues to produce sentences with prosodic focus: a combination of duration cues, mean and maximum F0, F0 range, and longer pre-target interval before the focused word onset, only mean F0 cues, only pre-target interval, and only duration cues. Results revealed that listeners can entrain in almost every condition except for where duration was the only reliable cue. Our findings suggest that listeners are flexible in the cues they use for focus processing.
  • Cutler, A., Burchfield, L. A., & Antoniou, M. (2018). Factors affecting talker adaptation in a second language. In J. Epps, J. Wolfe, J. Smith, & C. Jones (Eds.), Proceedings of the 17th Australasian International Conference on Speech Science and Technology (pp. 33-36).

    Abstract

    Listeners adapt rapidly to previously unheard talkers by adjusting phoneme categories using lexical knowledge, in a process termed lexically-guided perceptual learning. Although this is firmly established for listening in the native language (L1), perceptual flexibility in second languages (L2) is as yet less well understood. We report two experiments examining L1 and L2 perceptual learning, the first in Mandarin-English late bilinguals, the second in Australian learners of Mandarin. Both studies showed stronger learning in L1; in L2, however, learning appeared for the English-L1 group but not for the Mandarin-L1 group. Phonological mapping differences from the L1 to the L2 are suggested as the reason for this result.
  • Cutler, A., & Farrell, J. (2018). Listening in first and second language. In J. I. Liontas (Ed.), The TESOL encyclopedia of language teaching. New York: Wiley. doi:10.1002/9781118784235.eelt0583.

    Abstract

    Listeners' recognition of spoken language involves complex decoding processes: The continuous speech stream must be segmented into its component words, and words must be recognized despite great variability in their pronunciation (due to talker differences, or to influence of phonetic context, or to speech register) and despite competition from many spuriously present forms supported by the speech signal. L1 listeners deal more readily with all levels of this complexity than L2 listeners. Fortunately, the decoding processes necessary for competent L2 listening can be taught in the classroom. Evidence-based methodologies targeted at the development of efficient speech decoding include teaching of minimal pairs, of phonotactic constraints, and of reduction processes, as well as the use of dictation and L2 video captions.
  • Johnson, E. K., Bruggeman, L., & Cutler, A. (2018). Abstraction and the (misnamed) language familiarity effect. Cognitive Science, 42, 633-645. doi:10.1111/cogs.12520.

    Abstract

    Talkers are recognized more accurately if they are speaking the listeners’ native language rather than an unfamiliar language. This “language familiarity effect” has been shown not to depend upon comprehension and must instead involve language sound patterns. We further examine the level of sound-pattern processing involved, by comparing talker recognition in foreign languages versus two varieties of English, by (a) English speakers of one variety, (b) English speakers of the other variety, and (c) non-native listeners (more familiar with one of the varieties). All listener groups performed better with native than foreign speech, but no effect of language variety appeared: Native listeners discriminated talkers equally well in each, with the native variety never outdoing the other variety, and non-native listeners discriminated talkers equally poorly in each, irrespective of the variety's familiarity. The results suggest that this talker recognition effect rests not on simple familiarity, but on an abstract level of phonological processing
  • Kidd, E., Junge, C., Spokes, T., Morrison, L., & Cutler, A. (2018). Individual differences in infant speech segmentation: Achieving the lexical shift. Infancy, 23(6), 770-794. doi:10.1111/infa.12256.

    Abstract

    We report a large‐scale electrophysiological study of infant speech segmentation, in which over 100 English‐acquiring 9‐month‐olds were exposed to unfamiliar bisyllabic words embedded in sentences (e.g., He saw a wild eagle up there), after which their brain responses to either the just‐familiarized word (eagle) or a control word (coral) were recorded. When initial exposure occurs in continuous speech, as here, past studies have reported that even somewhat older infants do not reliably recognize target words, but that successful segmentation varies across children. Here, we both confirm and further uncover the nature of this variation. The segmentation response systematically varied across individuals and was related to their vocabulary development. About one‐third of the group showed a left‐frontally located relative negativity in response to familiar versus control targets, which has previously been described as a mature response. Another third showed a similarly located positive‐going reaction (a previously described immature response), and the remaining third formed an intermediate grouping that was primarily characterized by an initial response delay. A fine‐grained group‐level analysis suggested that a developmental shift to a lexical mode of processing occurs toward the end of the first year, with variation across individual infants in the exact timing of this shift.

    Additional information

    supporting information
  • Norris, D., McQueen, J. M., & Cutler, A. (2018). Commentary on “Interaction in spoken word recognition models". Frontiers in Psychology, 9: 1568. doi:10.3389/fpsyg.2018.01568.
  • Burnham, D., Ambikairajah, E., Arciuli, J., Bennamoun, M., Best, C. T., Bird, S., Butcher, A. R., Cassidy, S., Chetty, G., Cox, F. M., Cutler, A., Dale, R., Epps, J. R., Fletcher, J. M., Goecke, R., Grayden, D. B., Hajek, J. T., Ingram, J. C., Ishihara, S., Kemp, N. and 10 moreBurnham, D., Ambikairajah, E., Arciuli, J., Bennamoun, M., Best, C. T., Bird, S., Butcher, A. R., Cassidy, S., Chetty, G., Cox, F. M., Cutler, A., Dale, R., Epps, J. R., Fletcher, J. M., Goecke, R., Grayden, D. B., Hajek, J. T., Ingram, J. C., Ishihara, S., Kemp, N., Kinoshita, Y., Kuratate, T., Lewis, T. W., Loakes, D. E., Onslow, M., Powers, D. M., Rose, P., Togneri, R., Tran, D., & Wagner, M. (2009). A blueprint for a comprehensive Australian English auditory-visual speech corpus. In M. Haugh, K. Burridge, J. Mulder, & P. Peters (Eds.), Selected proceedings of the 2008 HCSNet Workshop on Designing the Australian National Corpus (pp. 96-107). Somerville, MA: Cascadilla Proceedings Project.

    Abstract

    Large auditory-visual (AV) speech corpora are the grist of modern research in speech science, but no such corpus exists for Australian English. This is unfortunate, for speech science is the brains behind speech technology and applications such as text-to-speech (TTS) synthesis, automatic speech recognition (ASR), speaker recognition and forensic identification, talking heads, and hearing prostheses. Advances in these research areas in Australia require a large corpus of Australian English. Here the authors describe a blueprint for building the Big Australian Speech Corpus (the Big ASC), a corpus of over 1,100 speakers from urban and rural Australia, including speakers of non-indigenous, indigenous, ethnocultural, and disordered forms of Australian English, each of whom would be sampled on three occasions in a range of speech tasks designed by the researchers who would be using the corpus.
  • Cutler, A. (2009). Greater sensitivity to prosodic goodness in non-native than in native listeners. Journal of the Acoustical Society of America, 125, 3522-3525. doi:10.1121/1.3117434.

    Abstract

    English listeners largely disregard suprasegmental cues to stress in recognizing words. Evidence for this includes the demonstration of Fear et al. [J. Acoust. Soc. Am. 97, 1893–1904 (1995)] that cross-splicings are tolerated between stressed and unstressed full vowels (e.g., au- of autumn, automata). Dutch listeners, however, do exploit suprasegmental stress cues in recognizing native-language words. In this study, Dutch listeners were presented with English materials from the study of Fear et al. Acceptability ratings by these listeners revealed sensitivity to suprasegmental mismatch, in particular, in replacements of unstressed full vowels by higher-stressed vowels, thus evincing greater sensitivity to prosodic goodness than had been shown by the original native listener group.
  • Cutler, A., Davis, C., & Kim, J. (2009). Non-automaticity of use of orthographic knowledge in phoneme evaluation. In Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009) (pp. 380-383). Causal Productions Pty Ltd.

    Abstract

    Two phoneme goodness rating experiments addressed the role of orthographic knowledge in the evaluation of speech sounds. Ratings for the best tokens of /s/ were higher in words spelled with S (e.g., bless) than in words where /s/ was spelled with C (e.g., voice). This difference did not appear for analogous nonwords for which every lexical neighbour had either S or C spelling (pless, floice). Models of phonemic processing incorporating obligatory influence of lexical information in phonemic processing cannot explain this dissociation; the data are consistent with models in which phonemic decisions are not subject to necessary top-down lexical influence.
  • Cutler, A. (2009). Psycholinguistics in our time. In P. Rabbitt (Ed.), Inside psychology: A science over 50 years (pp. 91-101). Oxford: Oxford University Press.
  • Cutler, A., Otake, T., & McQueen, J. M. (2009). Vowel devoicing and the perception of spoken Japanese words. Journal of the Acoustical Society of America, 125(3), 1693-1703. doi:10.1121/1.3075556.

    Abstract

    Three experiments, in which Japanese listeners detected Japanese words embedded in nonsense sequences, examined the perceptual consequences of vowel devoicing in that language. Since vowelless sequences disrupt speech segmentation [Norris et al. (1997). Cognit. Psychol. 34, 191– 243], devoicing is potentially problematic for perception. Words in initial position in nonsense sequences were detected more easily when followed by a sequence containing a vowel than by a vowelless segment (with or without further context), and vowelless segments that were potential devoicing environments were no easier than those not allowing devoicing. Thus asa, “morning,” was easier in asau or asazu than in all of asap, asapdo, asaf, or asafte, despite the fact that the /f/ in the latter two is a possible realization of fu, with devoiced [u]. Japanese listeners thus do not treat devoicing contexts as if they always contain vowels. Words in final position in nonsense sequences, however, produced a different pattern: here, preceding vowelless contexts allowing devoicing impeded word detection less strongly (so, sake was detected less accurately, but not less rapidly, in nyaksake—possibly arising from nyakusake—than in nyagusake). This is consistent with listeners treating consonant sequences as potential realizations of parts of existing lexical candidates wherever possible.
  • Kooijman, V., Hagoort, P., & Cutler, A. (2009). Prosodic structure in early word segmentation: ERP evidence from Dutch ten-month-olds. Infancy, 14, 591 -612. doi:10.1080/15250000903263957.

    Abstract

    Recognizing word boundaries in continuous speech requires detailed knowledge of the native language. In the first year of life, infants acquire considerable word segmentation abilities. Infants at this early stage in word segmentation rely to a large extent on the metrical pattern of their native language, at least in stress-based languages. In Dutch and English (both languages with a preferred trochaic stress pattern), segmentation of strong-weak words develops rapidly between 7 and 10 months of age. Nevertheless, trochaic languages contain not only strong-weak words but also words with a weak-strong stress pattern. In this article, we present electrophysiological evidence of the beginnings of weak-strong word segmentation in Dutch 10-month-olds. At this age, the ability to combine different cues for efficient word segmentation does not yet seem to be completely developed. We provide evidence that Dutch infants still largely rely on strong syllables, even for the segmentation of weak-strong words.
  • Tyler, M., & Cutler, A. (2009). Cross-language differences in cue use for speech segmentation. Journal of the Acoustical Society of America, 126, 367-376. doi:10.1121/1.3129127.

    Abstract

    Two artificial-language learning experiments directly compared English, French, and Dutch listeners’ use of suprasegmental cues for continuous-speech segmentation. In both experiments, listeners heard unbroken sequences of consonant-vowel syllables, composed of recurring three- and four-syllable “words.” These words were demarcated by(a) no cue other than transitional probabilities induced by their recurrence, (b) a consistent left-edge cue, or (c) a consistent right-edge cue. Experiment 1 examined a vowel lengthening cue. All three listener groups benefited from this cue in right-edge position; none benefited from it in left-edge position. Experiment 2 examined a pitch-movement cue. English listeners used this cue in left-edge position, French listeners used it in right-edge position, and Dutch listeners used it in both positions. These findings are interpreted as evidence of both language-universal and language-specific effects. Final lengthening is a language-universal effect expressing a more general (non-linguistic) mechanism. Pitch movement expresses prominence which has characteristically different placements across languages: typically at right edges in French, but at left edges in English and Dutch. Finally, stress realization in English versus Dutch encourages greater attention to suprasegmental variation by Dutch than by English listeners, allowing Dutch listeners to benefit from an informative pitch-movement cue even in an uncharacteristic position.
  • Costa, A., Cutler, A., & Sebastian-Galles, N. (1998). Effects of phoneme repertoire on phoneme decision. Perception and Psychophysics, 60, 1022-1031.

    Abstract

    In three experiments, listeners detected vowel or consonant targets in lists of CV syllables constructed from five vowels and five consonants. Responses were faster in a predictable context (e.g., listening for a vowel target in a list of syllables all beginning with the same consonant) than in an unpredictable context (e.g., listening for a vowel target in a list of syllables beginning with different consonants). In Experiment 1, the listeners’ native language was Dutch, in which vowel and consonant repertoires are similar in size. The difference between predictable and unpredictable contexts was comparable for vowel and consonant targets. In Experiments 2 and 3, the listeners’ native language was Spanish, which has four times as many consonants as vowels; here effects of an unpredictable consonant context on vowel detection were significantly greater than effects of an unpredictable vowel context on consonant detection. This finding suggests that listeners’ processing of phonemes takes into account the constitution of their language’s phonemic repertoire and the implications that this has for contextual variability.
  • Cutler, A., & Otake, T. (1998). Assimilation of place in Japanese and Dutch. In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: vol. 5 (pp. 1751-1754). Sydney: ICLSP.

    Abstract

    Assimilation of place of articulation across a nasal and a following stop consonant is obligatory in Japanese, but not in Dutch. In four experiments the processing of assimilated forms by speakers of Japanese and Dutch was compared, using a task in which listeners blended pseudo-word pairs such as ranga-serupa. An assimilated blend of this pair would be rampa, an unassimilated blend rangpa. Japanese listeners produced significantly more assimilated than unassimilated forms, both with pseudo-Japanese and pseudo-Dutch materials, while Dutch listeners produced significantly more unassimilated than assimilated forms in each materials set. This suggests that Japanese listeners, whose native-language phonology involves obligatory assimilation constraints, represent the assimilated nasals in nasal-stop sequences as unmarked for place of articulation, while Dutch listeners, who are accustomed to hearing unassimilated forms, represent the same nasal segments as marked for place of articulation.
  • Cutler, A. (1998). How listeners find the right words. In Proceedings of the Sixteenth International Congress on Acoustics: Vol. 2 (pp. 1377-1380). Melville, NY: Acoustical Society of America.

    Abstract

    Languages contain tens of thousands of words, but these are constructed from a tiny handful of phonetic elements. Consequently, words resemble one another, or can be embedded within one another, a coup stick snot with standing. me process of spoken-word recognition by human listeners involves activation of multiple word candidates consistent with the input, and direct competition between activated candidate words. Further, human listeners are sensitive, at an early, prelexical, stage of speeeh processing, to constraints on what could potentially be a word of the language.
  • Cutler, A., Treiman, R., & Van Ooijen, B. (1998). Orthografik inkoncistensy ephekts in foneme detektion? In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: Vol. 6 (pp. 2783-2786). Sydney: ICSLP.

    Abstract

    The phoneme detection task is widely used in spoken word recognition research. Alphabetically literate participants, however, are more used to explicit representations of letters than of phonemes. The present study explored whether phoneme detection is sensitive to how target phonemes are, or may be, orthographically realised. Listeners detected the target sounds [b,m,t,f,s,k] in word-initial position in sequences of isolated English words. Response times were faster to the targets [b,m,t], which have consistent word-initial spelling, than to the targets [f,s,k], which are inconsistently spelled, but only when listeners’ attention was drawn to spelling by the presence in the experiment of many irregularly spelled fillers. Within the inconsistent targets [f,s,k], there was no significant difference between responses to targets in words with majority and minority spellings. We conclude that performance in the phoneme detection task is not necessarily sensitive to orthographic effects, but that salient orthographic manipulation can induce such sensitivity.
  • Cutler, A. (1998). Prosodic structure and word recognition. In A. D. Friederici (Ed.), Language comprehension: A biological perspective (pp. 41-70). Heidelberg: Springer.
  • Cutler, A. (1998). The recognition of spoken words with variable representations. In D. Duez (Ed.), Proceedings of the ESCA Workshop on Sound Patterns of Spontaneous Speech (pp. 83-92). Aix-en-Provence: Université de Aix-en-Provence.
  • Kuijpers, C. T., Coolen, R., Houston, D., & Cutler, A. (1998). Using the head-turning technique to explore cross-linguistic performance differences. In C. Rovee-Collier, L. Lipsitt, & H. Hayne (Eds.), Advances in infancy research: Vol. 12 (pp. 205-220). Stamford: Ablex.
  • McQueen, J. M., & Cutler, A. (1998). Morphology in word recognition. In A. M. Zwicky, & A. Spencer (Eds.), The handbook of morphology (pp. 406-427). Oxford: Blackwell.
  • McQueen, J. M., & Cutler, A. (1998). Spotting (different kinds of) words in (different kinds of) context. In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: Vol. 6 (pp. 2791-2794). Sydney: ICSLP.

    Abstract

    The results of a word-spotting experiment are presented in which Dutch listeners tried to spot different types of bisyllabic Dutch words embedded in different types of nonsense contexts. Embedded verbs were not reliably harder to spot than embedded nouns; this suggests that nouns and verbs are recognised via the same basic processes. Iambic words were no harder to spot than trochaic words, suggesting that trochaic words are not in principle easier to recognise than iambic words. Words were harder to spot in consonantal contexts (i.e., contexts which themselves could not be words) than in longer contexts which contained at least one vowel (i.e., contexts which, though not words, were possible words of Dutch). A control experiment showed that this difference was not due to acoustic differences between the words in each context. The results support the claim that spoken-word recognition is sensitive to the viability of sound sequences as possible words.
  • Botelho da Silva, T., & Cutler, A. (1993). Ill-formedness and transformability in Portuguese idioms. In C. Cacciari, & P. Tabossi (Eds.), Idioms: Processing, structure and interpretation (pp. 129-143). Hillsdale, NJ: Erlbaum.
  • Cutler, A., Kearns, R., Norris, D., & Scott, D. R. (1993). Problems with click detection: Insights from cross-linguistic comparisons. Speech Communication, 13, 401-410. doi:10.1016/0167-6393(93)90038-M.

    Abstract

    Cross-linguistic comparisons may shed light on the levels of processing involved in the performance of psycholinguistic tasks. For instance, if the same pattern of results appears whether or not subjects understand the experimental materials, it may be concluded that the results do not reflect higher-level linguistic processing. In the present study, English and French listeners performed two tasks - click location and speeded click detection - with both English and French sentences, closely matched for syntactic and phonological structure. Clicks were located more accurately in open- than in closed-class words in both English and French; they were detected more rapidly in open- than in closed-class words in English, but not in French. The two listener groups produced the same pattern of responses, suggesting that higher-level linguistic processing was not involved in the listeners' responses. It is concluded that click detection tasks are primarily sensitive to low-level (e.g. acoustic) effects, and hence are not well suited to the investigation of linguistic processing.
  • Cutler, A. (1993). Segmentation problems, rhythmic solutions. Lingua, 92, 81-104. doi:10.1016/0024-3841(94)90338-7.

    Abstract

    The lexicon contains discrete entries, which must be located in speech input in order for speech to be understood; but the continuity of speech signals means that lexical access from spoken input involves a segmentation problem for listeners. The speech environment of prelinguistic infants may not provide special information to assist the infant listeners in solving this problem. Mature language users in possession of a lexicon might be thought to be able to avoid explicit segmentation of speech by relying on information from successful lexical access; however, evidence from adult perceptual studies indicates that listeners do use explicit segmentation procedures. These procedures differ across languages and seem to exploit language-specific rhythmic structure. Efficient as these procedures are, they may not have been developed in response to statistical properties of the input, because bilinguals, equally competent in two languages, apparently only possess one rhythmic segmentation procedure. The origin of rhythmic segmentation may therefore lie in the infant's exploitation of rhythm to solve the segmentation problem and gain a first toehold on lexical acquisition. Recent evidence from speech production and perception studies with prelinguistic infants supports the claim that infants are sensitive to rhythmic structure and its relationship to lexical segmentation.
  • Cutler, A. (1993). Segmenting speech in different languages. The Psychologist, 6(10), 453-455.
  • Cutler, A. (1993). Language-specific processing: Does the evidence converge? In G. T. Altmann, & R. C. Shillcock (Eds.), Cognitive models of speech processing: The Sperlonga Meeting II (pp. 115-123). Hillsdale, NJ: Erlbaum.
  • Cutler, A. (1993). Phonological cues to open- and closed-class words in the processing of spoken sentences. Journal of Psycholinguistic Research, 22, 109-131.

    Abstract

    Evidence is presented that (a) the open and the closed word classes in English have different phonological characteristics, (b) the phonological dimension on which they differ is one to which listeners are highly sensitive, and (c) spoken open- and closed-class words produce different patterns of results in some auditory recognition tasks. What implications might link these findings? Two recent lines of evidence from disparate paradigms—the learning of an artificial language, and natural and experimentally induced misperception of juncture—are summarized, both of which suggest that listeners are sensitive to the phonological reflections of open- vs. closed-class word status. Although these correlates cannot be strictly necessary for efficient processing, if they are present listeners exploit them in making word class assignments. That such a use of phonological information is of value to listeners could be indirect evidence that open- vs. closed-class words undergo different processing operations. Parts of the research reported in this paper were carried out in collaboration with Sally Butterfield and David Carter, and supported by the Alvey Directorate (United Kingdom). Jonathan Stankler's master's research was supported by the Science and Engineering Research Council (United Kingdom). Thanks to all of the above, and to Merrill Garrett, Mike Kelly, James McQueen, and Dennis Norris for further assistance.
  • Cutler, A., & Mehler, J. (1993). The periodicity bias. Journal of Phonetics, 21, 101-108.
  • Jusczyk, P. W., Cutler, A., & Redanz, N. J. (1993). Infants’ preference for the predominant stress patterns of English words. Child Development, 64, 675-687. Retrieved from http://www.jstor.org/stable/1131210.

    Abstract

    One critical aspect of language acquisition is the development of a lexicon that associates sounds and meanings; but developing a lexicon first requires that the infant segment utterances into individual words. How might the infant begin this process? The present study was designed to examine the potential role that sensitivity to predominant stress patterns of words might play in lexical development. In English, by far the majority of words have stressed (strong) initial syllables. Experiment 1 of our study demonstrated that by 9 months of age American infants listen significantly longer to words with strong/weak stress patterns than to words with weak/strong stress patterns. However, Experiment 2 showed that no significant preferences for the predominant stress pattern appear with 6-month-old infants, which suggests that the preference develops as a result of increasing familiarity with the prosodic features of the native language. In a third experiment, 9-month-olds showed a preference for strong/weak patterns even when the speech input was low-pass filtered, which suggests that their preference is specifically for the prosodic structure of the words. Together the results suggest that attention to predominant stress patterns in the native language may form an important part of the infant's process of developing a lexicon.
  • Nix, A. J., Mehta, G., Dye, J., & Cutler, A. (1993). Phoneme detection as a tool for comparing perception of natural and synthetic speech. Computer Speech and Language, 7, 211-228. doi:10.1006/csla.1993.1011.

    Abstract

    On simple intelligibility measures, high-quality synthesiser output now scores almost as well as natural speech. Nevertheless, it is widely agreed that perception of synthetic speech is a harder task for listeners than perception of natural speech; in particular, it has been hypothesized that listeners have difficulty identifying phonemes in synthetic speech. If so, a simple measure of the speed with which a phoneme can be identified should prove a useful tool for comparing perception of synthetic and natural speech. The phoneme detection task was here used in three experiments comparing perception of natural and synthetic speech. In the first, response times to synthetic and natural targets were not significantly different, but in the second and third experiments response times to synthetic targets were significantly slower than to natural targets. A speed-accuracy tradeoff in the third experiment suggests that an important factor in this task is the response criterion adopted by subjects. It is concluded that the phoneme detection task is a useful tool for investigating phonetic processing of synthetic speech input, but subjects must be encouraged to adopt a response criterion which emphasizes rapid responding. When this is the case, significantly longer response times for synthetic targets can indicate a processing disadvantage for synthetic speech at an early level of phonetic analysis.
  • Otake, T., Hatano, G., Cutler, A., & Mehler, J. (1993). Mora or syllable? Speech segmentation in Japanese. Journal of Memory and Language, 32, 258-278. doi:10.1006/jmla.1993.1014.

    Abstract

    Four experiments examined segmentation of spoken Japanese words by native and non-native listeners. Previous studies suggested that language rhythm determines the segmentation unit most natural to native listeners: French has syllabic rhythm, and French listeners use the syllable in segmentation, while English has stress rhythm, and segmentation by English listeners is based on stress. The rhythm of Japanese is based on a subsyllabic unit, the mora. In the present experiments Japanese listeners′ response patterns were consistent with moraic segmentation; acoustic artifacts could not have determined the results since nonnative (English and French) listeners showed different response patterns with the same materials. Predictions of a syllabic hypothesis were disconfirmed in the Japanese listeners′ results; in contrast, French listeners showed a pattern of responses consistent with the syllabic hypothesis. The results provide further evidence that listeners′ segmentation of spoken words relies on procedures determined by the characteristic phonology of their native language.
  • Van Ooijen, B., Cutler, A., & Berinetto, P. M. (1993). Click detection in Italian and English. In Eurospeech 93: Vol. 1 (pp. 681-684). Berlin: ESCA.

    Abstract

    We report four experiments in which English and Italian monolinguals detected clicks in continous speech in their native language. Two of the experiments used an off-line location task, and two used an on-line reaction time task. Despite there being large differences between English and Italian with respect to rhythmic characteristics, very similar response patterns were found for the two language groups. It is concluded that the process of click detection operates independently from language-specific differences in perceptual processing at the sublexical level.
  • Young, D., Altmann, G. T., Cutler, A., & Norris, D. (1993). Metrical structure and the perception of time-compressed speech. In Eurospeech 93: Vol. 2 (pp. 771-774).

    Abstract

    In the absence of explicitly marked cues to word boundaries, listeners tend to segment spoken English at the onset of strong syllables. This may suggest that under difficult listening conditions, speech should be easier to recognize where strong syllables are word-initial. We report two experiments in which listeners were presented with sentences which had been time-compressed to make listening difficult. The first study contrasted sentences in which all content words began with strong syllables with sentences in which all content words began with weak syllables. The intelligibility of the two groups of sentences did not differ significantly. Apparent rhythmic effects in the results prompted a second experiment; however, no significant effects of systematic rhythmic manipulation were observed. In both experiments, the strongest predictor of intelligibility was the rated plausibility of the sentences. We conclude that listeners' recognition responses to time-compressed speech may be strongly subject to experiential bias; effects of rhythmic structure are most likely to show up also as bias effects.
  • Cutler, A., Mehler, J., Norris, D., & Segui, J. (1983). A language-specific comprehension strategy [Letters to Nature]. Nature, 304, 159-160. doi:10.1038/304159a0.

    Abstract

    Infants acquire whatever language is spoken in the environment into which they are born. The mental capability of the newborn child is not biased in any way towards the acquisition of one human language rather than another. Because psychologists who attempt to model the process of language comprehension are interested in the structure of the human mind, rather than in the properties of individual languages, strategies which they incorporate in their models are presumed to be universal, not language-specific. In other words, strategies of comprehension are presumed to be characteristic of the human language processing system, rather than, say, the French, English, or Igbo language processing systems. We report here, however, on a comprehension strategy which appears to be used by native speakers of French but not by native speakers of English.
  • Cutler, A. (1983). Lexical complexity and sentence processing. In G. B. Flores d'Arcais, & R. J. Jarvella (Eds.), The process of language understanding (pp. 43-79). Chichester, Sussex: Wiley.
  • Cutler, A. (1983). Semantics, syntax and sentence accent. In M. Van den Broecke, & A. Cohen (Eds.), Proceedings of the Tenth International Congress of Phonetic Sciences (pp. 85-91). Dordrecht: Foris.
  • Cutler, A., & Ladd, D. R. (Eds.). (1983). Prosody: Models and measurements. Heidelberg: Springer.
  • Cutler, A. (1983). Speakers’ conceptions of the functions of prosody. In A. Cutler, & D. R. Ladd (Eds.), Prosody: Models and measurements (pp. 79-91). Heidelberg: Springer.
  • Ladd, D. R., & Cutler, A. (1983). Models and measurements in the study of prosody. In A. Cutler, & D. R. Ladd (Eds.), Prosody: Models and measurements (pp. 1-10). Heidelberg: Springer.
  • Levelt, W. J. M., & Cutler, A. (1983). Prosodic marking in speech repair. Journal of semantics, 2, 205-217. doi:10.1093/semant/2.2.205.

    Abstract

    Spontaneous self-corrections in speech pose a communication problem; the speaker must make clear to the listener not only that the original Utterance was faulty, but where it was faulty and how the fault is to be corrected. Prosodic marking of corrections - making the prosody of the repair noticeably different from that of the original utterance - offers a resource which the speaker can exploit to provide the listener with such information. A corpus of more than 400 spontaneous speech repairs was analysed, and the prosodic characteristics compared with the syntactic and semantic characteristics of each repair. Prosodic marking showed no relationship at all with the syntactic characteristics of repairs. Instead, marking was associated with certain semantic factors: repairs were marked when the original utterance had been actually erroneous, rather than simply less appropriate than the repair; and repairs tended to be marked more often when the set of items encompassing the error and the repair was small rather than when it was large. These findings lend further weight to the characterization of accent as essentially semantic in function.

Share this page