Publications

  • Ernestus, M. (2014). Acoustic reduction and the roles of abstractions and exemplars in speech processing. Lingua, 142, 27-41. doi:10.1016/j.lingua.2012.12.006.

    Abstract

    Acoustic reduction refers to the frequent phenomenon in conversational speech that words are produced with fewer or lenited segments compared to their citation forms. The few published studies on the production and comprehension of acoustic reduction have important implications for the debate on the relevance of abstractions and exemplars in speech processing. This article discusses these implications. It first briefly introduces the key assumptions of simple abstractionist and simple exemplar-based models. It then discusses the literature on acoustic reduction and draws the conclusion that both types of models need to be extended to explain all findings. The ultimate model should allow for the storage of different pronunciation variants, but also reserve an important role for phonetic implementation. Furthermore, the recognition of a highly reduced pronunciation variant requires top-down information and leads to activation of the corresponding unreduced variant, the variant that reaches listeners’ consciousness. These findings are best accounted for in hybrid models, assuming both abstract representations and exemplars. None of the hybrid models formulated so far can account for all data on reduced speech and we need further research for obtaining detailed insight into how speakers produce and listeners comprehend reduced speech.
  • Ernestus, M., & Giezenaar, G. (2014). Een goed verstaander heeft maar een half woord nodig. In B. Bossers (Ed.), Vakwerk 9: Achtergronden van de NT2-lespraktijk: Lezingen conferentie Hoeven 2014 (pp. 81-92). Amsterdam: BV NT2.
  • Ernestus, M., Kočková-Amortová, L., & Pollak, P. (2014). The Nijmegen corpus of casual Czech. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 365-370).

    Abstract

    This article introduces a new speech corpus, the Nijmegen Corpus of Casual Czech (NCCCz), which contains more than 30 hours of high-quality recordings of casual conversations in Common Czech, among ten groups of three male and ten groups of three female friends. All speakers were native speakers of Czech, raised in Prague or in the region of Central Bohemia, and were between 19 and 26 years old. Every group of speakers consisted of one confederate, who was instructed to keep the conversations lively, and two speakers naive to the purposes of the recordings. The naive speakers were engaged in conversations for approximately 90 minutes, while the confederate joined them for approximately the last 72 minutes. The corpus was orthographically annotated by experienced transcribers and this orthographic transcription was aligned with the speech signal. In addition, the conversations were videotaped. This corpus can form the basis for all types of research on casual conversations in Czech, including phonetic research and research on how to improve automatic speech recognition. The corpus will be freely available.
  • Lahey, M., & Ernestus, M. (2014). Pronunciation variation in infant-directed speech: Phonetic reduction of two highly frequent words. Language Learning and Development, 10, 308-327. doi:10.1080/15475441.2013.860813.

    Abstract

    In spontaneous conversations between adults, words are often pronounced with fewer segments or syllables than their citation forms. The question arises whether infant-directed speech also contains phonetic reduction. If so, infants would be presented with speech input that enables them to acquire reduced variants from an early age. This study compared speech directed at 11- and 12-month-old infants with adult-directed conversational speech and adult-directed read speech. In an acoustic study, 216 tokens of the Dutch words allemaal and helemaal from speech corpora were analyzed for duration, number of syllables, and vowel quality. In a perception study, adult participants rated these same materials for reduction and provided phonetic transcriptions. The results show that these two words are frequently reduced in infant-directed speech, and that their degree of reduction is comparable with conversational adult-directed speech. These findings suggest that lexical representations for reduced pronunciation variants can be acquired early in linguistic development.

  • Mizera, P., Pollak, P., Kolman, A., & Ernestus, M. (2014). Impact of irregular pronunciation on phonetic segmentation of Nijmegen corpus of Casual Czech. In P. Sojka, A. Horák, I. Kopecek, & K. Pala (Eds.), Text, Speech and Dialogue: 17th International Conference, TSD 2014, Brno, Czech Republic, September 8-12, 2014. Proceedings (pp. 499-506). Heidelberg: Springer.

    Abstract

    This paper describes a pilot study of phonetic segmentation applied to the Nijmegen Corpus of Casual Czech (NCCCz). This corpus contains informal, highly spontaneous speech, which influences the character of the produced speech at various levels. This work is part of wider research on pronunciation reduction in such informal speech. We present an analysis of the accuracy of phonetic segmentation when canonical or reduced pronunciations are used. The achieved segmentation accuracy provides information about the general accuracy of the acoustic modelling that is intended to be applied in spontaneous speech recognition. As a byproduct of this segmentation work, the paper also describes the created lexicon with canonical pronunciations of words in NCCCz, a tool supporting pronunciation checks of lexicon items, and a mini-database of selected utterances from NCCCz manually labelled at the phonetic level, suitable for evaluation purposes.
  • Schertz, J., & Ernestus, M. (2014). Variability in the pronunciation of non-native English the: Effects of frequency and disfluencies. Corpus Linguistics and Linguistic Theory, 10, 329-345. doi:10.1515/cllt-2014-0024.

    Abstract

    This study examines how lexical frequency and planning problems can predict phonetic variability in the function word ‘the’ in conversational speech produced by non-native speakers of English. We examined 3180 tokens of ‘the’ drawn from English conversations between native speakers of Czech or Norwegian. Using regression models, we investigated the effect of following word frequency and disfluencies on three phonetic parameters: vowel duration, vowel quality, and consonant quality. Overall, the non-native speakers showed variation that is very similar to the variation displayed by native speakers of English. Like native speakers, Czech speakers showed an effect of frequency on vowel durations, which were shorter in more frequent word sequences. Both groups of speakers showed an effect of frequency on consonant quality: the substitution of another consonant for /ð/ occurred more often in the context of more frequent words. The speakers in this study also showed a native-like allophonic distinction in vowel quality, in which /ði/ occurs more often before vowels and /ðə/ before consonants. Vowel durations were longer in the presence of following disfluencies, again mirroring patterns in native speakers, and the consonant quality was more likely to be the target /ð/ before disfluencies, as opposed to a different consonant. The fact that non-native speakers show native-like sensitivity to lexical frequency and disfluencies suggests that these effects are consequences of a general, non-language-specific production mechanism governing language planning. On the other hand, the non-native speakers in this study did not show native-like patterns of vowel quality in the presence of disfluencies, suggesting that the pattern attested in native speakers of English may result from language-specific processes separate from the general production mechanisms.
  • Ten Bosch, L., Ernestus, M., & Boves, L. (2014). Comparing reaction time sequences from human participants and computational models. In Proceedings of Interspeech 2014: 15th Annual Conference of the International Speech Communication Association (pp. 462-466).

    Abstract

    This paper addresses the question of how to compare reaction times computed by a computational model of speech comprehension with reaction times observed in participants. The question is based on the observation that reaction time sequences differ substantially between participants, which raises the issue of how exactly the model is to be assessed. Part of the variation in reaction time sequences is caused by the so-called local speed: the current reaction time correlates to some extent with a number of previous reaction times, due to slow variations in attention, fatigue, etc. This paper proposes a method, based on time series analysis, to filter the observed reaction times in order to separate out the local speed effects. Results show that after such filtering both the between-participant correlations and the average correlation between participant and model increase. The presented technique provides insights into relevant aspects that are to be taken into account when comparing reaction time sequences.
  • Baayen, H., Levelt, W. J. M., Schreuder, R., & Ernestus, M. (2007). Paradigmatic structure in speech production. Proceedings from the Annual Meeting of the Chicago Linguistic Society, 43(1), 1-29.

    Abstract

    The main goal of the present study is to trace the consequences of local and global markedness for the processing of singular and plural nouns. Decompositional models such as those proposed by Pinker (1997, 1999) and Levelt et al. (1999) predict a lexeme frequency effect and no effects of the frequencies of the singular and the plural forms. Experiments 1 and 4 reveal the expected lexeme frequency effect. Furthermore, in these experiments there are no clear independent effects of the frequencies of the inflected forms. However, the effects of Entropy and Relative Entropy that emerge from these experiments show that in production knowledge of the probabilities of the individual inflected forms does play a role, albeit indirectly. These entropy effects bear witness to the importance of paradigmatic organization of inflected forms in the mental lexicon, both at the level of individual lexemes (Entropy) and at the general level of the class of nouns (Relative Entropy).
  • Ernestus, M., Van Mulken, M., & Baayen, R. H. (2007). Ridders en heiligen in tijd en ruimte: Moderne stylometrische technieken toegepast op Oud-Franse teksten. Taal en Tongval, 58, 1-83.

    Abstract

    This article shows that Old-French literary texts differ systematically in their relative frequencies of syntactic constructions. These frequencies reflect differences in register (poetry versus prose), region (Picardy, Champagne, and Eastern France), time period (until 1250, 1251 – 1300, 1301 – 1350), and genre (hagiography, romance of chivalry, or other).
  • Ernestus, M., & Baayen, R. H. (2007). Paradigmatic effects in auditory word recognition: The case of alternating voice in Dutch. Language and Cognitive Processes, 22(1), 1-24. doi:10.1080/01690960500268303.

    Abstract

    Two lexical decision experiments addressed the role of paradigmatic effects in auditory word recognition. Experiment 1 showed that listeners classified a form with an incorrectly voiced final obstruent more readily as a word if the obstruent is realised as voiced in other forms of that word's morphological paradigm. Moreover, if such was the case, the exact probability of paradigmatic voicing emerged as a significant predictor of the response latencies. A greater probability of voicing correlated with longer response latencies for words correctly realised with voiceless final obstruents. A similar effect of this probability was observed in Experiment 2 for words with completely voiceless or weakly voiced (incompletely neutralised) final obstruents. These data demonstrate the relevance of paradigmatically related complex words for the processing of morphologically simple words in auditory word recognition.
  • Ernestus, M., & Baayen, R. H. (2007). The comprehension of acoustically reduced morphologically complex words: The roles of deletion, duration, and frequency of occurrence. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 773-776). Dudweiler: Pirrot.

    Abstract

    This study addresses the roles of segment deletion, durational reduction, and frequency of use in the comprehension of morphologically complex words. We report two auditory lexical decision experiments with reduced and unreduced prefixed Dutch words. We found that segment deletions as such delayed comprehension. Simultaneously, however, longer durations of the different parts of the words appeared to increase lexical competition, either from the word’s stem (Experiment 1) or from the word’s morphological continuation forms (Experiment 2). Increased lexical competition slowed down especially the comprehension of low frequency words, which shows that speakers do not try to meet listeners’ needs when they reduce especially high frequency words.
  • Ernestus, M., & Baayen, R. H. (2007). Intraparadigmatic effects on the perception of voice. In J. van de Weijer, & E. J. van der Torre (Eds.), Voicing in Dutch: (De)voicing-phonology, phonetics, and psycholinguistics (pp. 153-173). Amsterdam: Benjamins.

    Abstract

    In Dutch, all morpheme-final obstruents are voiceless in word-final position. As a consequence, the distinction between obstruents that are voiced before vowel-initial suffixes and those that are always voiceless is neutralized. This study adds to the existing evidence that the neutralization is incomplete: neutralized, alternating plosives tend to have shorter bursts than non-alternating plosives. Furthermore, in a rating study, listeners scored the alternating plosives as more voiced than the nonalternating plosives, showing sensitivity to the subtle subphonemic cues in the acoustic signal. Importantly, the participants who were presented with the complete words, instead of just the final rhymes, scored the alternating plosives as even more voiced. This shows that listeners’ perception of voice is affected by their knowledge of the obstruent’s realization in the word’s morphological paradigm. Apparently, subphonemic paradigmatic levelling is a characteristic of both production and perception. We explain the effects within an analogy-based approach.
  • Kuperman, V., Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2007). Morphological predictability and acoustic duration of interfixes in Dutch compounds. Journal of the Acoustical Society of America, 121(4), 2261-2271. doi:10.1121/1.2537393.

    Abstract

    This study explores the effects of informational redundancy, as carried by a word's morphological paradigmatic structure, on acoustic duration in read aloud speech. The hypothesis that the more predictable a linguistic unit is, the less salient its realization, was tested on the basis of the acoustic duration of interfixes in Dutch compounds in two datasets: One for the interfix -s- (1155 tokens) and one for the interfix -e(n)- (742 tokens). Both datasets show that the more probable the interfix is, given the compound and its constituents, the longer it is realized. These findings run counter to the predictions of information-theoretical approaches and can be resolved by the Paradigmatic Signal Enhancement Hypothesis. This hypothesis argues that whenever selection of an element from alternatives is probabilistic, the element's duration is predicted by the amount of paradigmatic support for the element: The most likely alternative in the paradigm of selection is realized longer.
  • Kuzla, C., & Ernestus, M. (2007). Prosodic conditioning of phonetic detail of German plosives. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 461-464). Dudweiler: Pirrot.

    Abstract

    The present study investigates the influence of prosodic structure on the fine-grained phonetic details of German plosives which also cue the phonological fortis-lenis contrast. Closure durations were found to be longer at higher prosodic boundaries. There was also less glottal vibration in lenis plosives at higher prosodic boundaries. Voice onset time in lenis plosives was not affected by prosody. In contrast, for the fortis plosives VOT decreased at higher boundaries, as did the maximal intensity of the release. These results demonstrate that the effects of prosody on different phonetic cues can go in opposite directions, but are overall constrained by the need to maintain phonological contrasts. While prosodic effects on some cues are compatible with a ‘fortition’ account of prosodic strengthening or with a general feature enhancement explanation, the effects on others enhance paradigmatic contrasts only within a given prosodic position.
  • Kuzla, C., Cho, T., & Ernestus, M. (2007). Prosodic strengthening of German fricatives in duration and assimilatory devoicing. Journal of Phonetics, 35(3), 301-320. doi:10.1016/j.wocn.2006.11.001.

    Abstract

    This study addressed prosodic effects on the duration of and amount of glottal vibration in German word-initial fricatives /f, v, z/ in assimilatory and non-assimilatory devoicing contexts. Fricatives following /ə/ (non-assimilation context) were longer and were produced with less glottal vibration after higher prosodic boundaries, reflecting domain-initial prosodic strengthening. After /t/ (assimilation context), lenis fricatives (/v, z/) were produced with less glottal vibration than after /ə/, due to assimilatory devoicing. This devoicing was especially strong across lower prosodic boundaries, showing the influence of prosodic structure on sandhi processes. Reduction in glottal vibration made lenis fricatives more fortis-like (/f, s/). Importantly, fricative duration, another major cue to the fortis-lenis distinction, was affected by initial lengthening, but not by assimilation. Hence, at smaller boundaries, fricatives were more devoiced (more fortis-like), but also shorter (more lenis-like). As a consequence, the fortis and lenis fricatives remained acoustically distinct in all prosodic and segmental contexts. Overall, /z/ was devoiced to a greater extent than /v/. Since /z/ does not have a fortis counterpart in word-initial position, these findings suggest that phonotactic restrictions constrain phonetic processes. The present study illuminates a complex interaction of prosody, sandhi processes, and phonotactics, yielding systematic phonetic cues to prosodic structure and phonological distinctions.
  • Scharenborg, O., Ernestus, M., & Wan, V. (2007). Segmentation of speech: Child's play? In H. van Hamme, & R. van Son (Eds.), Proceedings of Interspeech 2007 (pp. 1953-1956). Adelaide: Causal Productions.

    Abstract

    The difficulty of the task of segmenting a speech signal into its words is immediately clear when listening to a foreign language; it is much harder to segment the signal into its words, since the words of the language are unknown. Infants are faced with the same task when learning their first language. This study provides a better understanding of the task that infants face while learning their native language. We employed an automatic algorithm on the task of speech segmentation without prior knowledge of the labels of the phonemes. An analysis of the boundaries erroneously placed inside a phoneme showed that the algorithm consistently placed additional boundaries in phonemes in which acoustic changes occur. These acoustic changes may be as great as the transition from the closure to the burst of a plosive or as subtle as the formant transitions in low or back vowels. Moreover, we found that glottal vibration may attenuate the relevance of acoustic changes within obstruents. An interesting question for further research is how infants learn to overcome the natural tendency to segment these ‘dynamic’ phonemes.
  • Ernestus, M., & Mak, W. M. (2004). Distinctive phonological features differ in relevance for both spoken and written word recognition. Brain and Language, 90(1-3), 378-392. doi:10.1016/S0093-934X(03)00449-8.

    Abstract

    This paper discusses four experiments on Dutch which show that distinctive phonological features differ in their relevance for word recognition. The relevance of a feature for word recognition depends on its phonological stability, that is, the extent to which that feature is generally realized in accordance with its lexical specification in the relevant word position. If one feature value is uninformative, all values of that feature are less relevant for word recognition, with the least informative feature being the least relevant. Features differ in their relevance both in spoken and written word recognition, though the differences are more pronounced in auditory lexical decision than in self-paced reading.
  • Ernestus, M., & Baayen, R. H. (2004). Analogical effects in regular past tense production in Dutch. Linguistics, 42(5), 873-903. doi:10.1515/ling.2004.031.

    Abstract

    This study addresses the question to what extent the production of regular past tense forms in Dutch is affected by analogical processes. We report an experiment in which native speakers of Dutch listened to existing regular verbs over headphones, and had to indicate which of the past tense allomorphs, te or de, was appropriate for these verbs. According to generative analyses, the choice between the two suffixes is completely regular and governed by the underlying [voice]-specification of the stem-final segment. In this approach, no analogical effects are expected. In connectionist and analogical approaches, by contrast, the phonological similarity structure in the lexicon is expected to affect lexical processing. Our experimental results support the latter approach: all participants created more nonstandard past tense forms, produced more inconsistency errors, and responded more slowly for verbs with stronger analogical support for the nonstandard form.
  • Ernestus, M., & Baayen, R. H. (2004). Kuchde, tobte, en turfte: Lekkage in 't kofschip. Onze Taal, 73(12), 360-361.
  • Kemps, R. J. J. K., Ernestus, M., Schreuder, R., & Baayen, R. H. (2004). Processing reduced word forms: The suffix restoration effect. Brain and Language, 90(1-3), 117-127. doi:10.1016/S0093-934X(03)00425-5.

    Abstract

    Listeners cannot recognize highly reduced word forms in isolation, but they can do so when these forms are presented in context (Ernestus, Baayen, & Schreuder, 2002). This suggests that not all possible surface forms of words have equal status in the mental lexicon. The present study shows that the reduced forms are linked to the canonical representations in the mental lexicon, and that these latter representations induce reconstruction processes. Listeners restore suffixes that are partly or completely missing in reduced word forms. A series of phoneme-monitoring experiments reveals the nature of this restoration: the basis for suffix restoration is mainly phonological in nature, but orthography has an influence as well.
  • Moscoso del Prado Martín, F., Ernestus, M., & Baayen, R. H. (2004). Do type and token effects reflect different mechanisms? Connectionist modeling of Dutch past-tense formation and final devoicing. Brain and Language, 90(1-3), 287-298. doi:10.1016/j.bandl.2003.12.002.

    Abstract

    In this paper, we show that both token and type-based effects in lexical processing can result from a single, token-based, system, and therefore, do not necessarily reflect different levels of processing. We report three Simple Recurrent Networks modeling Dutch past-tense formation. These networks show token-based frequency effects and type-based analogical effects closely matching the behavior of human participants when producing past-tense forms for both existing verbs and pseudo-verbs. The third network covers the full vocabulary of Dutch, without imposing predefined linguistic structure on the input or output words.
  • Rietveld, T., Van Hout, R., & Ernestus, M. (2004). Pitfalls in corpus research. Computers and the Humanities, 38(4), 343-362. doi:10.1007/s10579-004-1919-1.

    Abstract

    This paper discusses some pitfalls in corpus research and suggests solutions on the basis of examples and computer simulations. We first address reliability problems in language transcriptions, agreement between transcribers, and how disagreements can be dealt with. We then show that the frequencies of occurrence obtained from a corpus cannot always be analyzed with the traditional χ² test, as corpus data are often not sequentially independent and unit independent. Next, we stress the relevance of the power of statistical tests, and the sizes of statistically significant effects. Finally, we point out that a t-test based on log odds often provides a better alternative to a χ² analysis based on frequency counts.
