Publications

  • Ernestus, M. (2014). Acoustic reduction and the roles of abstractions and exemplars in speech processing. Lingua, 142, 27-41. doi:10.1016/j.lingua.2012.12.006.

    Abstract

    Acoustic reduction refers to the frequent phenomenon in conversational speech that words are produced with fewer or lenited segments compared to their citation forms. The few published studies on the production and comprehension of acoustic reduction have important implications for the debate on the relevance of abstractions and exemplars in speech processing. This article discusses these implications. It first briefly introduces the key assumptions of simple abstractionist and simple exemplar-based models. It then discusses the literature on acoustic reduction and draws the conclusion that both types of models need to be extended to explain all findings. The ultimate model should allow for the storage of different pronunciation variants, but also reserve an important role for phonetic implementation. Furthermore, the recognition of a highly reduced pronunciation variant requires top-down information and leads to activation of the corresponding unreduced variant, the variant that reaches listeners’ consciousness. These findings are best accounted for in hybrid models, assuming both abstract representations and exemplars. None of the hybrid models formulated so far can account for all data on reduced speech, and further research is needed to obtain detailed insight into how speakers produce and listeners comprehend reduced speech.
  • Ernestus, M., & Giezenaar, G. (2014). Een goed verstaander heeft maar een half woord nodig. In B. Bossers (Ed.), Vakwerk 9: Achtergronden van de NT2-lespraktijk: Lezingen conferentie Hoeven 2014 (pp. 81-92). Amsterdam: BV NT2.
  • Ernestus, M., Kočková-Amortová, L., & Pollak, P. (2014). The Nijmegen corpus of casual Czech. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 365-370).

    Abstract

    This article introduces a new speech corpus, the Nijmegen Corpus of Casual Czech (NCCCz), which contains more than 30 hours of high-quality recordings of casual conversations in Common Czech, among ten groups of three male and ten groups of three female friends. All speakers were native speakers of Czech, raised in Prague or in the region of Central Bohemia, and were between 19 and 26 years old. Every group of speakers consisted of one confederate, who was instructed to keep the conversations lively, and two speakers naive to the purposes of the recordings. The naive speakers were engaged in conversations for approximately 90 minutes, while the confederate joined them for approximately the last 72 minutes. The corpus was orthographically annotated by experienced transcribers and this orthographic transcription was aligned with the speech signal. In addition, the conversations were videotaped. This corpus can form the basis for all types of research on casual conversations in Czech, including phonetic research and research on how to improve automatic speech recognition. The corpus will be freely available.
  • Lahey, M., & Ernestus, M. (2014). Pronunciation variation in infant-directed speech: Phonetic reduction of two highly frequent words. Language Learning and Development, 10, 308-327. doi:10.1080/15475441.2013.860813.

    Abstract

    In spontaneous conversations between adults, words are often pronounced with fewer segments or syllables than their citation forms. The question arises whether infant-directed speech also contains phonetic reduction. If so, infants would be presented with speech input that enables them to acquire reduced variants from an early age. This study compared speech directed at 11- and 12-month-old infants with adult-directed conversational speech and adult-directed read speech. In an acoustic study, 216 tokens of the Dutch words allemaal and helemaal from speech corpora were analyzed for duration, number of syllables, and vowel quality. In a perception study, adult participants rated these same materials for reduction and provided phonetic transcriptions. The results show that these two words are frequently reduced in infant-directed speech, and that their degree of reduction is comparable with conversational adult-directed speech. These findings suggest that lexical representations for reduced pronunciation variants can be acquired early in linguistic development.
  • Mizera, P., Pollak, P., Kolman, A., & Ernestus, M. (2014). Impact of irregular pronunciation on phonetic segmentation of Nijmegen corpus of Casual Czech. In P. Sojka, A. Horák, I. Kopecek, & K. Pala (Eds.), Text, Speech and Dialogue: 17th International Conference, TSD 2014, Brno, Czech Republic, September 8-12, 2014. Proceedings (pp. 499-506). Heidelberg: Springer.

    Abstract

    This paper describes a pilot study of phonetic segmentation applied to the Nijmegen Corpus of Casual Czech (NCCCz). This corpus contains informal speech of a strongly spontaneous nature, which influences the character of the produced speech at various levels. This work is part of wider research on the analysis of pronunciation reduction in such informal speech. We present an analysis of the accuracy of phonetic segmentation when canonical or reduced pronunciations are used. The achieved segmentation accuracy provides information about the general accuracy of the acoustic modelling that is to be applied in spontaneous speech recognition. As a byproduct of the presented spontaneous speech segmentation, this paper also describes the created lexicon with canonical pronunciations of words in NCCCz, a tool supporting pronunciation checks of lexicon items, and a mini-database of selected utterances from NCCCz, manually labelled at the phonetic level and suitable for evaluation purposes.
  • Schertz, J., & Ernestus, M. (2014). Variability in the pronunciation of non-native English the: Effects of frequency and disfluencies. Corpus Linguistics and Linguistic Theory, 10, 329-345. doi:10.1515/cllt-2014-0024.

    Abstract

    This study examines how lexical frequency and planning problems can predict phonetic variability in the function word ‘the’ in conversational speech produced by non-native speakers of English. We examined 3180 tokens of ‘the’ drawn from English conversations between native speakers of Czech or Norwegian. Using regression models, we investigated the effect of following word frequency and disfluencies on three phonetic parameters: vowel duration, vowel quality, and consonant quality. Overall, the non-native speakers showed variation that is very similar to the variation displayed by native speakers of English. Like native speakers, Czech speakers showed an effect of frequency on vowel durations, which were shorter in more frequent word sequences. Both groups of speakers showed an effect of frequency on consonant quality: the substitution of another consonant for /ð/ occurred more often in the context of more frequent words. The speakers in this study also showed a native-like allophonic distinction in vowel quality, in which /ði/ occurs more often before vowels and /ðə/ before consonants. Vowel durations were longer in the presence of following disfluencies, again mirroring patterns in native speakers, and the consonant quality was more likely to be the target /ð/ before disfluencies, as opposed to a different consonant. The fact that non-native speakers show native-like sensitivity to lexical frequency and disfluencies suggests that these effects are consequences of a general, non-language-specific production mechanism governing language planning. On the other hand, the non-native speakers in this study did not show native-like patterns of vowel quality in the presence of disfluencies, suggesting that the pattern attested in native speakers of English may result from language-specific processes separate from the general production mechanisms.
  • Ten Bosch, L., Ernestus, M., & Boves, L. (2014). Comparing reaction time sequences from human participants and computational models. In Proceedings of Interspeech 2014: 15th Annual Conference of the International Speech Communication Association (pp. 462-466).

    Abstract

    This paper addresses the question of how to compare reaction times computed by a computational model of speech comprehension with reaction times observed in participants. The question is based on the observation that reaction time sequences differ substantially per participant, which raises the issue of how exactly the model is to be assessed. Part of the variation in reaction time sequences is caused by the so-called local speed: the current reaction time correlates to some extent with a number of previous reaction times, due to slow variations in attention, fatigue, etc. This paper proposes a method, based on time series analysis, to filter the observed reaction times in order to separate out the local speed effects. Results show that after such filtering both the between-participant correlations and the average correlation between participant and model increase. The presented technique provides insights into relevant aspects that are to be taken into account when comparing reaction time sequences.
  • Ernestus, M. (2013). Halve woorden [Inaugural lecture]. Nijmegen: Radboud University.

    Abstract

    Lecture delivered upon accepting the office of Professor of Psycholinguistics at the Faculty of Arts of Radboud University Nijmegen on Friday 18 January 2013.
  • Hanique, I., Aalders, E., & Ernestus, M. (2013). How robust are exemplar effects in word comprehension? The mental lexicon, 8, 269-294. doi:10.1075/ml.8.3.01han.

    Abstract

    This paper studies the robustness of exemplar effects in word comprehension by means of four long-term priming experiments with lexical decision tasks in Dutch. A prime and target represented the same word type and were presented with the same or different degree of reduction. In Experiment 1, participants heard only a small number of trials, a large proportion of repeated words, and stimuli produced by only one speaker. They recognized targets more quickly if these represented the same degree of reduction as their primes, which forms additional evidence for the exemplar effects reported in the literature. Similar effects were found for two speakers who differ in their pronunciations. In Experiment 2, with a smaller proportion of repeated words and more trials between prime and target, participants recognized targets preceded by primes with the same or a different degree of reduction equally quickly. Also, in Experiments 3 and 4, in which listeners were not exposed to one but two types of pronunciation variation (reduction degree and speaker voice), no exemplar effects arose. We conclude that the role of exemplars in speech comprehension during natural conversations, which typically involve several speakers and few repeated content words, may be smaller than previously assumed.
  • Hanique, I., Ernestus, M., & Schuppler, B. (2013). Informal speech processes can be categorical in nature, even if they affect many different words. Journal of the Acoustical Society of America, 133, 1644-1655. doi:10.1121/1.4790352.

    Abstract

    This paper investigates the nature of reduction phenomena in informal speech. It addresses the question whether reduction processes that affect many word types, but only if they occur in connected informal speech, may be categorical in nature. The focus is on reduction of schwa in the prefixes and on word-final /t/ in Dutch past participles. More than 2000 tokens of past participles from the Ernestus Corpus of Spontaneous Dutch and the Spoken Dutch Corpus (both from the interview and read speech component) were transcribed automatically. The results demonstrate that the presence and duration of /t/ are affected by approximately the same phonetic variables, indicating that the absence of /t/ is the extreme result of shortening, and thus results from a gradient reduction process. Also for schwa, the data show that mainly phonetic variables influence its reduction, but its presence is affected by different and more variables than its duration, which suggests that the absence of schwa may result from gradient as well as categorical processes. These conclusions are supported by the distributions of the segments’ durations. These findings provide evidence that reduction phenomena which affect many words in informal conversations may also result from categorical reduction processes.
  • Johnson, E. K., Lahey, M., Ernestus, M., & Cutler, A. (2013). A multimodal corpus of speech to infant and adult listeners. Journal of the Acoustical Society of America, 134, EL534-EL540. doi:10.1121/1.4828977.

    Abstract

    An audio and video corpus of speech addressed to 28 11-month-olds is described. The corpus allows comparisons between adult speech directed towards infants, familiar adults and unfamiliar adult addressees, as well as of caregivers’ word teaching strategies across word classes. Summary data show that infant-directed speech differed more from speech to unfamiliar than familiar adults; that word teaching strategies for nominals versus verbs and adjectives differed; that mothers mostly addressed infants with multi-word utterances; and that infants’ vocabulary size was unrelated to speech rate, but correlated positively with predominance of continuous caregiver speech (not of isolated words) in the input.
  • De Schryver, J., Neijt, A., Ghesquière, P., & Ernestus, M. (2013). Zij surfde, maar hij durfte niet: De spellingproblematiek van de zwakke verleden tijd in Nederland en Vlaanderen. Dutch Journal of Applied Linguistics, 2(2), 133-151. doi:10.1075/dujal.2.2.01de.

    Abstract

    Although the spelling of Dutch past tense forms of weak verbs is generally considered simple (after all, they are spelled as they sound), even university students make a striking number of errors when choosing between the endings -te and -de. In part these errors are ‘natural’ in the sense that they result from the workings of frequency and analogy. On the other hand, we find that Dutch participants from the Netherlands make many more errors than Flemish participants, at least when the stem ends in a coronal fricative (s, z, f, v). Since the participants from the Netherlands appear to have a better command of the ‘rule’ (the mnemonic ’t kofschip) than the Flemish participants, the explanation for the difference must be sought in a sound change that occurs in the Netherlands but not, or hardly, in Flanders: the devoicing of fricatives. The spelling problem calls for didactic and/or political measures: it can probably be solved to a large extent by slightly adjusting the spelling rules.
  • Ten Bosch, L., Boves, L., & Ernestus, M. (2013). Towards an end-to-end computational model of speech comprehension: simulating a lexical decision task. In Proceedings of INTERSPEECH 2013: 14th Annual Conference of the International Speech Communication Association (pp. 2822-2826).

    Abstract

    This paper describes a computational model of speech comprehension that takes the acoustic signal as input and predicts reaction times as observed in an auditory lexical decision task. By doing so, we explore a new generation of end-to-end computational models that are able to simulate the behaviour of human subjects participating in a psycholinguistic experiment. So far, nearly all computational models of speech comprehension do not start from the speech signal itself, but from abstract representations of the speech signal, while the few existing models that do start from the acoustic signal cannot directly model reaction times as obtained in comprehension experiments. The main functional components in our model are the perception stage, which is compatible with the psycholinguistic model Shortlist B and is implemented with techniques from automatic speech recognition, and the decision stage, which is based on the linear ballistic accumulation decision model. We successfully tested our model against data from 20 participants performing a large-scale auditory lexical decision experiment. Analyses show that the model is a good predictor for the average judgment and reaction time for each word.
  • Ernestus, M., & Mak, W. M. (2004). Distinctive phonological features differ in relevance for both spoken and written word recognition. Brain and Language, 90(1-3), 378-392. doi:10.1016/S0093-934X(03)00449-8.

    Abstract

    This paper discusses four experiments on Dutch which show that distinctive phonological features differ in their relevance for word recognition. The relevance of a feature for word recognition depends on its phonological stability, that is, the extent to which that feature is generally realized in accordance with its lexical specification in the relevant word position. If one feature value is uninformative, all values of that feature are less relevant for word recognition, with the least informative feature being the least relevant. Features differ in their relevance both in spoken and written word recognition, though the differences are more pronounced in auditory lexical decision than in self-paced reading.
  • Ernestus, M., & Baayen, R. H. (2004). Analogical effects in regular past tense production in Dutch. Linguistics, 42(5), 873-903. doi:10.1515/ling.2004.031.

    Abstract

    This study addresses the question to what extent the production of regular past tense forms in Dutch is affected by analogical processes. We report an experiment in which native speakers of Dutch listened to existing regular verbs over headphones, and had to indicate which of the past tense allomorphs, te or de, was appropriate for these verbs. According to generative analyses, the choice between the two suffixes is completely regular and governed by the underlying [voice]-specification of the stem-final segment. In this approach, no analogical effects are expected. In connectionist and analogical approaches, by contrast, the phonological similarity structure in the lexicon is expected to affect lexical processing. Our experimental results support the latter approach: all participants created more nonstandard past tense forms, produced more inconsistency errors, and responded more slowly for verbs with stronger analogical support for the nonstandard form.
  • Ernestus, M., & Baayen, R. H. (2004). Kuchde, tobte, en turfte: Lekkage in 't kofschip. Onze Taal, 73(12), 360-361.
  • Kemps, R. J. J. K., Ernestus, M., Schreuder, R., & Baayen, R. H. (2004). Processing reduced word forms: The suffix restoration effect. Brain and Language, 90(1-3), 117-127. doi:10.1016/S0093-934X(03)00425-5.

    Abstract

    Listeners cannot recognize highly reduced word forms in isolation, but they can do so when these forms are presented in context (Ernestus, Baayen, & Schreuder, 2002). This suggests that not all possible surface forms of words have equal status in the mental lexicon. The present study shows that the reduced forms are linked to the canonical representations in the mental lexicon, and that these latter representations induce reconstruction processes. Listeners restore suffixes that are partly or completely missing in reduced word forms. A series of phoneme-monitoring experiments reveals the nature of this restoration: the basis for suffix restoration is mainly phonological in nature, but orthography has an influence as well.
  • Moscoso del Prado Martín, F., Ernestus, M., & Baayen, R. H. (2004). Do type and token effects reflect different mechanisms? Connectionist modeling of Dutch past-tense formation and final devoicing. Brain and Language, 90(1-3), 287-298. doi:10.1016/j.bandl.2003.12.002.

    Abstract

    In this paper, we show that both token and type-based effects in lexical processing can result from a single, token-based, system, and therefore, do not necessarily reflect different levels of processing. We report three Simple Recurrent Networks modeling Dutch past-tense formation. These networks show token-based frequency effects and type-based analogical effects closely matching the behavior of human participants when producing past-tense forms for both existing verbs and pseudo-verbs. The third network covers the full vocabulary of Dutch, without imposing predefined linguistic structure on the input or output words.
  • Rietveld, T., Van Hout, R., & Ernestus, M. (2004). Pitfalls in corpus research. Computers and the Humanities, 38(4), 343-362. doi:10.1007/s10579-004-1919-1.

    Abstract

    This paper discusses some pitfalls in corpus research and suggests solutions on the basis of examples and computer simulations. We first address reliability problems in language transcriptions, agreement between transcribers, and how disagreements can be dealt with. We then show that the frequencies of occurrence obtained from a corpus cannot always be analyzed with the traditional χ² test, as corpus data are often not sequentially independent and unit independent. Next, we stress the relevance of the power of statistical tests, and the sizes of statistically significant effects. Finally, we point out that a t-test based on log odds often provides a better alternative to a χ² analysis based on frequency counts.