Publications

  • Ernestus, M. (2012). Segmental within-speaker variation. In A. C. Cohn, C. Fougeron, & M. K. Huffman (Eds.), The Oxford handbook of laboratory phonology (pp. 93-102). New York: Oxford University Press.
  • Hanique, I., & Ernestus, M. (2012). The processes underlying two frequent casual speech phenomena in Dutch: A production experiment. In Proceedings of INTERSPEECH 2012: 13th Annual Conference of the International Speech Communication Association (pp. 2011-2014).

    Abstract

    This study investigated whether a shadowing task can provide insights into the nature of reduction processes that are typical of casual speech. We focused on the shortening and presence versus absence of schwa and /t/ in Dutch past participles. Results showed that the absence of these segments was affected by the same variables as their shortening, suggesting that absence mostly resulted from extreme gradient shortening. This contrasts with results based on recordings of spontaneous conversations. We hypothesize that this difference is due to non-casual fast speech elicited by a shadowing task.
  • Hanique, I., & Ernestus, M. (2012). The role of morphology in acoustic reduction. Lingue e linguaggio, 2012(2), 147-164. doi:10.1418/38783.

    Abstract

    This paper examines the role of morphological structure in the reduced pronunciation of morphologically complex words by discussing and re-analyzing data from the literature. Acoustic reduction refers to the phenomenon that, in spontaneous speech, phonemes may be shorter or absent. We review studies investigating effects of the repetition of a morpheme, of whether a segment plays a crucial role in the identification of its morpheme, and of a word's morphological decomposability. We conclude that these studies report either no effects of morphological structure or effects that are open to alternative interpretations. Our analysis also reveals the need for a uniform definition of morphological decomposability. Furthermore, we examine whether the reduction of segments in morphologically complex words correlates with these segments' contribution to the identification of the whole word, and discuss previous studies and new analyses supporting this hypothesis. We conclude that the data show no convincing evidence that morphological structure conditions reduction, which contrasts with the expectations of several models of speech production and of morphological processing (e.g., WEAVER++ and dual-route models). The data collected so far support psycholinguistic models which assume that all morphologically complex words are processed as complete units.
  • Schuppler, B., van Dommelen, W. A., Koreman, J., & Ernestus, M. (2012). How linguistic and probabilistic properties of a word affect the realization of its final /t/: Studies at the phonemic and sub-phonemic level. Journal of Phonetics, 40, 595-607. doi:10.1016/j.wocn.2012.05.004.

    Abstract

    This paper investigates the realization of word-final /t/ in conversational standard Dutch. First, based on a large number of word tokens (6747) annotated with broad phonetic transcription by an automatic transcription tool, we show that morphological properties of the words and their position in the utterance's syntactic structure play a role for the presence versus absence of their final /t/. We also replicate earlier findings on the role of predictability (word frequency and bigram frequency with the following word) and provide a detailed analysis of the role of segmental context. Second, we analyze the detailed acoustic properties of word-final /t/ on the basis of a smaller number of tokens (486) which were annotated manually. Our data show that word and bigram frequency as well as segmental context also predict the presence of sub-phonemic properties. The investigations presented in this paper extend research on the realization of /t/ in spontaneous speech and have potential consequences for psycholinguistic models of speech production and perception as well as for automatic speech recognition systems.
  • Torreira, F., & Ernestus, M. (2012). Weakening of intervocalic /s/ in the Nijmegen Corpus of Casual Spanish. Phonetica, 69, 124-148. doi:10.1159/000343635.
  • Van de Ven, M., Ernestus, M., & Schreuder, R. (2012). Predicting acoustically reduced words in spontaneous speech: The role of semantic/syntactic and acoustic cues in context. Laboratory Phonology, 3, 455-481. doi:10.1515/lp-2012-0020.

    Abstract

    In spontaneous speech, words may be realised shorter than in formal speech (e.g., English yesterday may be pronounced like [jɛʃeɩ]). Previous research has shown that context is required to understand highly reduced pronunciation variants. We investigated the extent to which listeners can predict low predictability reduced words on the basis of the semantic/syntactic and acoustic cues in their context. In four experiments, participants were presented with either the preceding context or the preceding and following context of reduced words, and either heard these fragments of conversational speech, or read their orthographic transcriptions. Participants were asked to predict the missing reduced word on the basis of the context alone, choosing from four plausible options. Participants made use of acoustic cues in the context, although casual speech typically has a high speech rate, and acoustic cues are much more unclear than in careful speech. Moreover, they relied on semantic/syntactic cues. Whenever there was a conflict between acoustic and semantic/syntactic contextual cues, measured as the word's probability given the surrounding words, listeners relied more heavily on acoustic cues. Further, context appeared generally insufficient to predict the reduced words, underpinning the significance of the acoustic characteristics of the reduced words themselves.
  • Viebahn, M. C., Ernestus, M., & McQueen, J. M. (2012). Co-occurrence of reduced word forms in natural speech. In Proceedings of INTERSPEECH 2012: 13th Annual Conference of the International Speech Communication Association (pp. 2019-2022).

    Abstract

    This paper presents a corpus study that investigates the co-occurrence of reduced word forms in natural speech. We extracted Dutch past participles from three different speech registers and investigated the influence of several predictor variables on the presence and duration of schwas in prefixes and /t/s in suffixes. Our results suggest that reduced word forms tend to co-occur even if we partial out the effect of speech rate. The implications of our findings for episodic and abstractionist models of lexical representation are discussed.
  • Bürki, A., Ernestus, M., & Frauenfelder, U. H. (2010). Is there only one "fenêtre" in the production lexicon? On-line evidence on the nature of phonological representations of pronunciation variants for French schwa words. Journal of Memory and Language, 62, 421-437. doi:10.1016/j.jml.2010.01.002.

    Abstract

    This study examines whether the production of words with two phonological variants involves single or multiple lexical phonological representations. Three production experiments investigated the roles of the relative frequencies of the two pronunciation variants of French words with schwa: the schwa variant and the reduced variant. In two naming tasks and in a symbol–word association learning task, variants with higher relative frequencies were produced faster. This suggests that the production lexicon keeps a frequency count for each variant and hence that schwa words are represented in the production lexicon with two different lexemes. In addition, the advantage for schwa variants over reduced variants in the naming tasks but not in the learning task and the absence of a variant relative frequency effect for schwa variants produced in isolation support the hypothesis that context affects the variants’ lexical activation and modulates the effect of variant relative frequency.
  • Hanique, I., Schuppler, B., & Ernestus, M. (2010). Morphological and predictability effects on schwa reduction: The case of Dutch word-initial syllables. In Proceedings of the 11th Annual Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan (pp. 933-936).

    Abstract

    This corpus-based study shows that the presence and duration of schwa in Dutch word-initial syllables are affected by a word’s predictability and its morphological structure. Schwa is less reduced in words that are more predictable given the following word. In addition, schwa may be longer if the syllable forms a prefix, and in prefixes the duration of schwa is positively correlated with the frequency of the word relative to its stem. Our results suggest that the conditions which favor reduced realizations are more complex than one would expect on the basis of the current literature.
  • Kuzla, C., Ernestus, M., & Mitterer, H. (2010). Compensation for assimilatory devoicing and prosodic structure in German fricative perception. In C. Fougeron, B. Kühnert, M. D'Imperio, & N. Vallée (Eds.), Laboratory Phonology 10 (pp. 731-757). Berlin: De Gruyter.
  • Pluymaekers, M., Ernestus, M., Baayen, R. H., & Booij, G. (2010). Morphological effects on fine phonetic detail: The case of Dutch -igheid. In C. Fougeron, B. Kühnert, M. D'Imperio, & N. Vallée (Eds.), Laboratory Phonology 10 (pp. 511-532). Berlin: De Gruyter.
  • Scharenborg, O., Wan, V., & Ernestus, M. (2010). Unsupervised speech segmentation: An analysis of the hypothesized phone boundaries. Journal of the Acoustical Society of America, 127, 1084-1095. doi:10.1121/1.3277194.

    Abstract

    Despite using different algorithms, most unsupervised automatic phone segmentation methods achieve similar performance in terms of percentage correct boundary detection. Nevertheless, unsupervised segmentation algorithms are not able to perfectly reproduce manually obtained reference transcriptions. This paper investigates fundamental problems for unsupervised segmentation algorithms by comparing a phone segmentation obtained using only the acoustic information present in the signal with a reference segmentation created by human transcribers. The analyses of the output of an unsupervised speech segmentation method that uses acoustic change to hypothesize boundaries showed that acoustic change is a fairly good indicator of segment boundaries: over two-thirds of the hypothesized boundaries coincide with segment boundaries. Statistical analyses showed that the errors are related to segment duration, sequences of similar segments, and inherently dynamic phones. In order to improve unsupervised automatic speech segmentation, current one-stage bottom-up segmentation methods should be expanded into two-stage segmentation methods that are able to use a mix of bottom-up information extracted from the speech signal and automatically derived top-down information. In this way, unsupervised methods can be improved while remaining flexible and language-independent.
  • Schuppler, B., Ernestus, M., Van Dommelen, W., & Koreman, J. (2010). Predicting human perception and ASR classification of word-final [t] by its acoustic sub-segmental properties. In Proceedings of the 11th Annual Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan (pp. 2466-2469).

    Abstract

    This paper presents a study on the acoustic sub-segmental properties of word-final /t/ in conversational standard Dutch and how these properties contribute to whether humans and an ASR system classify the /t/ as acoustically present or absent. In general, humans and the ASR system use the same cues (presence of a constriction, a burst, and alveolar frication), but the ASR system is also less sensitive to fine cues (weak bursts, smoothly starting friction) than human listeners and misled by the presence of glottal vibration. These data inform the further development of models of human and automatic speech processing.
  • Sikveland, A., Öttl, A., Amdal, I., Ernestus, M., Svendsen, T., & Edlund, J. (2010). Spontal-N: A Corpus of Interactional Spoken Norwegian. In N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, & D. Tapias (Eds.), Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10) (pp. 2986-2991). Paris: European Language Resources Association (ELRA).

    Abstract

    Spontal-N is a corpus of spontaneous, interactional Norwegian. To our knowledge, it is the first corpus of Norwegian in which the majority of speakers have spent significant parts of their lives in Sweden, and in which the recorded speech displays varying degrees of interference from Swedish. The corpus consists of studio quality audio- and video-recordings of four 30-minute free conversations between acquaintances, and a manual orthographic transcription of the entire material. On the basis of the orthographic transcriptions, we automatically annotated approximately 50 percent of the material on the phoneme level, by means of a forced alignment between the acoustic signal and pronunciations listed in a dictionary. Approximately seven percent of the automatic transcription was manually corrected. Taking the manual correction as a gold standard, we evaluated several sources of pronunciation variants for the automatic transcription. Spontal-N is intended as a general purpose speech resource that is also suitable for investigating phonetic detail.
  • Spilková, H., Brenner, D., Öttl, A., Vondřička, P., Van Dommelen, W., & Ernestus, M. (2010). The Kachna L1/L2 picture replication corpus. In N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, & D. Tapias (Eds.), Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10) (pp. 2432-2436). Paris: European Language Resources Association (ELRA).

    Abstract

    This paper presents the Kachna corpus of spontaneous speech, in which ten Czech and ten Norwegian speakers were recorded both in their native language and in English. The dialogues are elicited using a picture replication task that requires active cooperation and interaction of speakers by asking them to produce a drawing as close to the original as possible. The corpus is appropriate for the study of interactional features and speech reduction phenomena across native and second languages. The combination of productions in non-native English and in speakers’ native language is advantageous for investigation of L2 issues while providing an L1 behaviour reference from all the speakers. The corpus consists of 20 dialogues comprising 12 hours 53 minutes of recording, and was collected in 2008. Preparation of the transcriptions, including a manual orthographic transcription and an automatically generated phonetic transcription, is currently in progress. The phonetic transcriptions are automatically generated by aligning acoustic models with the speech signal on the basis of the orthographic transcriptions and a dictionary of pronunciation variants compiled for the relevant language. Upon completion the corpus will be made available via the European Language Resources Association (ELRA).
  • Torreira, F., & Ernestus, M. (2010). Phrase-medial vowel devoicing in spontaneous French. In Proceedings of the 11th Annual Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan (pp. 2006-2009).

    Abstract

    This study investigates phrase-medial vowel devoicing in European French (e.g. /ty po/ [typo] 'you can'). Our spontaneous speech data confirm that French phrase-medial devoicing is a frequent phenomenon affecting high vowels preceded by voiceless consonants. We also found that devoicing is more frequent in temporally reduced and coarticulated vowels. Complete and partial devoicing were conditioned by the same variables (speech rate, consonant type and distance from the end of the AP). Given these results, we propose that phrase-medial vowel devoicing in French arises mainly from the temporal compression of vocalic gestures and the aerodynamic conditions imposed by high vowels.
  • Torreira, F., Adda-Decker, M., & Ernestus, M. (2010). The Nijmegen corpus of casual French. Speech Communication, 52, 201-212. doi:10.1016/j.specom.2009.10.004.

    Abstract

    This article describes the preparation, recording and orthographic transcription of a new speech corpus, the Nijmegen Corpus of Casual French (NCCFr). The corpus contains a total of over 36 h of recordings of 46 French speakers engaged in conversations with friends. Casual speech was elicited during three different parts, which together provided around 90 min of speech from every pair of speakers. While Parts 1 and 2 did not require participants to perform any specific task, in Part 3 participants negotiated a common answer to general questions about society. Comparisons with the ESTER corpus of journalistic speech show that the two corpora contain speech of considerably different registers. A number of indicators of casualness, including swear words, casual words, verlan, disfluencies and word repetitions, are more frequent in the NCCFr than in the ESTER corpus, while the use of double negation, an indicator of formal speech, is less frequent. In general, these estimates of casualness are constant through the three parts of the recording sessions and across speakers. Based on these facts, we conclude that our corpus is a rich resource of highly casual speech, and that it can be effectively exploited by researchers in language science and technology.

  • Torreira, F., & Ernestus, M. (2010). The Nijmegen corpus of casual Spanish. In N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, & D. Tapias (Eds.), Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC'10) (pp. 2981-2985). Paris: European Language Resources Association (ELRA).

    Abstract

    This article describes the preparation, recording and orthographic transcription of a new speech corpus, the Nijmegen Corpus of Casual Spanish (NCCSp). The corpus contains around 30 hours of recordings of 52 Madrid Spanish speakers engaged in conversations with friends. Casual speech was elicited during three different parts, which together provided around ninety minutes of speech from every group of speakers. While Parts 1 and 2 did not require participants to perform any specific task, in Part 3 participants negotiated a common answer to general questions about society. Information about how to obtain a copy of the corpus can be found online at http://mirjamernestus.ruhosting.nl/Ernestus/NCCSp
  • Van de Ven, M., Tucker, B. V., & Ernestus, M. (2010). Semantic facilitation in bilingual everyday speech comprehension. In Proceedings of the 11th Annual Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan (pp. 1245-1248).

    Abstract

    Previous research suggests that bilinguals presented with low and high predictability sentences benefit from semantics in clear but not in conversational speech [1]. In everyday speech, however, many words are not highly predictable. Previous research has shown that native listeners can also use more subtle semantic contextual information [2]. The present study reports two auditory lexical decision experiments investigating to what extent late Asian-English bilinguals benefit from subtle semantic cues in their processing of English unreduced and reduced speech. Our results indicate that these bilinguals are less sensitive to semantic cues than native listeners for both speech registers.
  • Baayen, H., Levelt, W. J. M., Schreuder, R., & Ernestus, M. (2007). Paradigmatic structure in speech production. Proceedings from the Annual Meeting of the Chicago Linguistic Society, 43(1), 1-29.

    Abstract

    The main goal of the present study is to trace the consequences of local and global markedness for the processing of singular and plural nouns. Decompositional models such as those proposed by Pinker (1997, 1999) and Levelt et al. (1999) predict a lexeme frequency effect and no effects of the frequencies of the singular and the plural forms. Experiments 1 and 4 reveal the expected lexeme frequency effect. Furthermore, in these experiments there are no clear independent effects of the frequencies of the inflected forms. However, the effects of Entropy and Relative Entropy that emerge from these experiments show that in production knowledge of the probabilities of the individual inflected forms does play a role, albeit indirectly. These entropy effects bear witness to the importance of paradigmatic organization of inflected forms in the mental lexicon, both at the level of individual lexemes (Entropy) and at the general level of the class of nouns (Relative Entropy).
  • Ernestus, M., Van Mulken, M., & Baayen, R. H. (2007). Ridders en heiligen in tijd en ruimte: Moderne stylometrische technieken toegepast op Oud-Franse teksten. Taal en Tongval, 58, 1-83.

    Abstract

    This article shows that Old-French literary texts differ systematically in their relative frequencies of syntactic constructions. These frequencies reflect differences in register (poetry versus prose), region (Picardy, Champagne, and Eastern France), time period (until 1250, 1251 – 1300, 1301 – 1350), and genre (hagiography, romance of chivalry, or other).
  • Ernestus, M., & Baayen, R. H. (2007). Paradigmatic effects in auditory word recognition: The case of alternating voice in Dutch. Language and Cognitive Processes, 22(1), 1-24. doi:10.1080/01690960500268303.

    Abstract

    Two lexical decision experiments addressed the role of paradigmatic effects in auditory word recognition. Experiment 1 showed that listeners classified a form with an incorrectly voiced final obstruent more readily as a word if the obstruent is realised as voiced in other forms of that word's morphological paradigm. Moreover, if such was the case, the exact probability of paradigmatic voicing emerged as a significant predictor of the response latencies. A greater probability of voicing correlated with longer response latencies for words correctly realised with voiceless final obstruents. A similar effect of this probability was observed in Experiment 2 for words with completely voiceless or weakly voiced (incompletely neutralised) final obstruents. These data demonstrate the relevance of paradigmatically related complex words for the processing of morphologically simple words in auditory word recognition.
  • Ernestus, M., & Baayen, R. H. (2007). The comprehension of acoustically reduced morphologically complex words: The roles of deletion, duration, and frequency of occurrence. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 773-776). Dudweiler: Pirrot.

    Abstract

    This study addresses the roles of segment deletion, durational reduction, and frequency of use in the comprehension of morphologically complex words. We report two auditory lexical decision experiments with reduced and unreduced prefixed Dutch words. We found that segment deletions as such delayed comprehension. Simultaneously, however, longer durations of the different parts of the words appeared to increase lexical competition, either from the word’s stem (Experiment 1) or from the word’s morphological continuation forms (Experiment 2). Increased lexical competition slowed down especially the comprehension of low frequency words, which shows that speakers do not try to meet listeners’ needs when they reduce especially high frequency words.
  • Ernestus, M., & Baayen, R. H. (2007). Intraparadigmatic effects on the perception of voice. In J. van de Weijer, & E. J. van der Torre (Eds.), Voicing in Dutch: (De)voicing-phonology, phonetics, and psycholinguistics (pp. 153-173). Amsterdam: Benjamins.

    Abstract

    In Dutch, all morpheme-final obstruents are voiceless in word-final position. As a consequence, the distinction between obstruents that are voiced before vowel-initial suffixes and those that are always voiceless is neutralized. This study adds to the existing evidence that the neutralization is incomplete: neutralized, alternating plosives tend to have shorter bursts than non-alternating plosives. Furthermore, in a rating study, listeners scored the alternating plosives as more voiced than the nonalternating plosives, showing sensitivity to the subtle subphonemic cues in the acoustic signal. Importantly, the participants who were presented with the complete words, instead of just the final rhymes, scored the alternating plosives as even more voiced. This shows that listeners’ perception of voice is affected by their knowledge of the obstruent’s realization in the word’s morphological paradigm. Apparently, subphonemic paradigmatic levelling is a characteristic of both production and perception. We explain the effects within an analogy-based approach.
  • Kuperman, V., Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2007). Morphological predictability and acoustic duration of interfixes in Dutch compounds. Journal of the Acoustical Society of America, 121(4), 2261-2271. doi:10.1121/1.2537393.

    Abstract

    This study explores the effects of informational redundancy, as carried by a word's morphological paradigmatic structure, on acoustic duration in read aloud speech. The hypothesis that the more predictable a linguistic unit is, the less salient its realization, was tested on the basis of the acoustic duration of interfixes in Dutch compounds in two datasets: One for the interfix -s- (1155 tokens) and one for the interfix -e(n)- (742 tokens). Both datasets show that the more probable the interfix is, given the compound and its constituents, the longer it is realized. These findings run counter to the predictions of information-theoretical approaches and can be resolved by the Paradigmatic Signal Enhancement Hypothesis. This hypothesis argues that whenever selection of an element from alternatives is probabilistic, the element's duration is predicted by the amount of paradigmatic support for the element: The most likely alternative in the paradigm of selection is realized longer.
  • Kuzla, C., & Ernestus, M. (2007). Prosodic conditioning of phonetic detail of German plosives. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 461-464). Dudweiler: Pirrot.

    Abstract

    The present study investigates the influence of prosodic structure on the fine-grained phonetic details of German plosives which also cue the phonological fortis-lenis contrast. Closure durations were found to be longer at higher prosodic boundaries. There was also less glottal vibration in lenis plosives at higher prosodic boundaries. Voice onset time in lenis plosives was not affected by prosody. In contrast, for the fortis plosives VOT decreased at higher boundaries, as did the maximal intensity of the release. These results demonstrate that the effects of prosody on different phonetic cues can go into opposite directions, but are overall constrained by the need to maintain phonological contrasts. While prosodic effects on some cues are compatible with a ‘fortition’ account of prosodic strengthening or with a general feature enhancement explanation, the effects on others enhance paradigmatic contrasts only within a given prosodic position.
  • Kuzla, C., Cho, T., & Ernestus, M. (2007). Prosodic strengthening of German fricatives in duration and assimilatory devoicing. Journal of Phonetics, 35(3), 301-320. doi:10.1016/j.wocn.2006.11.001.

    Abstract

    This study addressed prosodic effects on the duration of and amount of glottal vibration in German word-initial fricatives /f, v, z/ in assimilatory and non-assimilatory devoicing contexts. Fricatives following /ə/ (non-assimilation context) were longer and were produced with less glottal vibration after higher prosodic boundaries, reflecting domain-initial prosodic strengthening. After /t/ (assimilation context), lenis fricatives (/v, z/) were produced with less glottal vibration than after /ə/, due to assimilatory devoicing. This devoicing was especially strong across lower prosodic boundaries, showing the influence of prosodic structure on sandhi processes. Reduction in glottal vibration made lenis fricatives more fortis-like (/f, s/). Importantly, fricative duration, another major cue to the fortis-lenis distinction, was affected by initial lengthening, but not by assimilation. Hence, at smaller boundaries, fricatives were more devoiced (more fortis-like), but also shorter (more lenis-like). As a consequence, the fortis and lenis fricatives remained acoustically distinct in all prosodic and segmental contexts. Overall, /z/ was devoiced to a greater extent than /v/. Since /z/ does not have a fortis counterpart in word-initial position, these findings suggest that phonotactic restrictions constrain phonetic processes. The present study illuminates a complex interaction of prosody, sandhi processes, and phonotactics, yielding systematic phonetic cues to prosodic structure and phonological distinctions.
  • Scharenborg, O., Ernestus, M., & Wan, V. (2007). Segmentation of speech: Child's play? In H. van Hamme, & R. van Son (Eds.), Proceedings of Interspeech 2007 (pp. 1953-1956). Adelaide: Causal Productions.

    Abstract

    The difficulty of the task of segmenting a speech signal into its words is immediately clear when listening to a foreign language; it is much harder to segment the signal into its words, since the words of the language are unknown. Infants are faced with the same task when learning their first language. This study provides a better understanding of the task that infants face while learning their native language. We employed an automatic algorithm on the task of speech segmentation without prior knowledge of the labels of the phonemes. An analysis of the boundaries erroneously placed inside a phoneme showed that the algorithm consistently placed additional boundaries in phonemes in which acoustic changes occur. These acoustic changes may be as great as the transition from the closure to the burst of a plosive or as subtle as the formant transitions in low or back vowels. Moreover, we found that glottal vibration may attenuate the relevance of acoustic changes within obstruents. An interesting question for further research is how infants learn to overcome the natural tendency to segment these ‘dynamic’ phonemes.
