Publications

Displaying 1 - 23 of 23
  • Ernestus, M., Dikmans, M., & Giezenaar, G. (2017). Advanced second language learners experience difficulties processing reduced word pronunciation variants. Dutch Journal of Applied Linguistics, 6(1), 1-20. doi:10.1075/dujal.6.1.01ern.

    Abstract

    Words are often pronounced with fewer segments in casual conversations than in formal speech. Previous research has shown that foreign language learners and beginning second language learners experience problems processing reduced speech. We examined whether this also holds for advanced second language learners. We designed a dictation task in Dutch consisting of sentences spliced from casual conversations and an unreduced counterpart of this task, with the same sentences carefully articulated by the same speaker. Advanced second language learners of Dutch produced substantially more transcription errors for the reduced than for the unreduced sentences. These errors made the sentences incomprehensible or led to non-intended meanings. The learners often did not rely on the semantic and syntactic information in the sentence or on the subsegmental cues to overcome the reductions. Hence, advanced second language learners also appear to suffer from the reduced pronunciation variants of words that are abundant in everyday conversations
  • Ernestus, M., Kouwenhoven, H., & Van Mulken, M. (2017). The direct and indirect effects of the phonotactic constraints in the listener's native language on the comprehension of reduced and unreduced word pronunciation variants in a foreign language. Journal of Phonetics, 62, 50-64. doi:10.1016/j.wocn.2017.02.003.

    Abstract

    This study investigates how the comprehension of casual speech in foreign languages is affected by the phonotactic constraints in the listener’s native language. Non-native listeners of English with different native languages heard short English phrases produced by native speakers of English or Spanish and they indicated whether these phrases included can or can’t. Native Mandarin listeners especially tended to interpret can’t as can. We interpret this result as a direct effect of the ban on word-final /nt/ in Mandarin. Both the native Mandarin and the native Spanish listeners did not take full advantage of the subsegmental information in the speech signal cueing reduced can’t. This finding is probably an indirect effect of the phonotactic constraints in their native languages: these listeners have difficulties interpreting the subsegmental cues because these cues do not occur or have different functions in their native languages. Dutch resembles English in the phonotactic constraints relevant to the comprehension of can’t, and native Dutch listeners showed similar patterns in their comprehension of native and non-native English to native English listeners. This result supports our conclusion that the major patterns in the comprehension results are driven by the phonotactic constraints in the listeners’ native languages.
  • Ten Bosch, L., Boves, L., & Ernestus, M. (2017). The recognition of compounds: A computational account. In Proceedings of Interspeech 2017 (pp. 1158-1162). doi:10.21437/Interspeech.2017-1048.

    Abstract

    This paper investigates the processes in comprehending spoken noun-noun compounds, using data from the BALDEY database. BALDEY contains lexicality judgments and reaction times (RTs) for Dutch stimuli for which also linguistic information is included. Two different approaches are combined. The first is based on regression by Dynamic Survival Analysis, which models decisions and RTs as a consequence of the fact that a cumulative density function exceeds some threshold. The parameters of that function are estimated from the observed RT data. The second approach is based on DIANA, a process-oriented computational model of human word comprehension, which simulates the comprehension process with the acoustic stimulus as input. DIANA gives the identity and the number of the word candidates that are activated at each 10 ms time step.

    Both approaches show how the processes involved in comprehending compounds change during a stimulus. Survival Analysis shows that the impact of word duration varies during the course of a stimulus. The density of word and non-word hypotheses in DIANA shows a corresponding pattern with different regimes. We show how the approaches complement each other, and discuss additional ways in which data and process models can be combined.
  • De Vaan, L., Van Krieken, K., Van den Bosch, W., Schreuder, R., & Ernestus, M. (2017). The traces that novel morphologically complex words leave in memory are abstract in nature. The Mental Lexicon, 12(2), 181-218. doi:10.1075/ml.16006.vaa.

    Abstract

    Previous work has shown that novel morphologically complex words (henceforth neologisms) leave traces in memory after just one encounter. This study addressed the question whether these traces are abstract in nature or exemplars. In three experiments, neologisms were either primed by themselves or by their stems. The primes occurred in the visual modality whereas the targets were presented in the auditory modality (Experiment 1) or vice versa (Experiments 2 and 3). The primes were presented in sentences in a selfpaced reading task (Experiment 1) or in stories in a listening comprehension task (Experiments 2 and 3). The targets were incorporated in lexical decision tasks, auditory or visual (Experiment 1 and Experiment 2, respectively), or in stories in a self-paced reading task (Experiment 3). The experimental part containing the targets immediately followed the familiarization phase with the primes (Experiment 1), or after a one week delay (Experiments 2 and 3). In all experiments, participants recognized neologisms faster if they had encountered them before (identity priming) than if the familiarization phase only contained the neologisms’ stems (stem priming). These results show that the priming effects are robust despite substantial differences between the primes and the targets. This suggests that the traces novel morphologically complex words leave in memory after just one encounter are abstract in nature.
  • Viebahn, M., Ernestus, M., & McQueen, J. M. (2017). Speaking style influences the brain’s electrophysiological response to grammatical errors in speech comprehension. Journal of Cognitive Neuroscience, 29(7), 1132-1146. doi:10.1162/jocn_a_01095.

    Abstract

    This electrophysiological study asked whether the brain processes grammatical gender
    violations in casual speech differently than in careful speech. Native speakers of Dutch were
    presented with utterances that contained adjective-noun pairs in which the adjective was either
    correctly inflected with a word-final schwa (e.g. een spannende roman “a suspenseful novel”) or
    incorrectly uninflected without that schwa (een spannend roman). Consistent with previous
    findings, the uninflected adjectives elicited an electrical brain response sensitive to syntactic
    violations when the talker was speaking in a careful manner. When the talker was speaking in a
    casual manner, this response was absent. A control condition showed electrophysiological responses
    for carefully as well as casually produced utterances with semantic anomalies, showing that
    listeners were able to understand the content of both types of utterance. The results suggest that
    listeners take information about the speaking style of a talker into account when processing the
    acoustic-phonetic information provided by the speech signal. Absent schwas in casual speech are
    effectively not grammatical gender violations. These changes in syntactic processing are evidence
    of contextually-driven neural flexibility.

    Files private

    Request files
  • Bürki, A., Ernestus, M., & Frauenfelder, U. H. (2010). Is there only one "fenêtre" in the production lexicon? On-line evidence on the nature of phonological representations of pronunciation variants for French schwa words. Journal of Memory and Language, 62, 421-437. doi:10.1016/j.jml.2010.01.002.

    Abstract

    This study examines whether the production of words with two phonological variants involves single or multiple lexical phonological representations. Three production experiments investigated the roles of the relative frequencies of the two pronunciation variants of French words with schwa: the schwa variant (e.g., Image ) and the reduced variant (e.g., Image ). In two naming tasks and in a symbol–word association learning task, variants with higher relative frequencies were produced faster. This suggests that the production lexicon keeps a frequency count for each variant and hence that schwa words are represented in the production lexicon with two different lexemes. In addition, the advantage for schwa variants over reduced variants in the naming tasks but not in the learning task and the absence of a variant relative frequency effect for schwa variants produced in isolation support the hypothesis that context affects the variants’ lexical activation and modulates the effect of variant relative frequency.
  • Hanique, I., Schuppler, B., & Ernestus, M. (2010). Morphological and predictability effects on schwa reduction: The case of Dutch word-initial syllables. In Proceedings of the 11th Annual Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan (pp. 933-936).

    Abstract

    This corpus-based study shows that the presence and duration of schwa in Dutch word-initial syllables are affected by a word’s predictability and its morphological structure. Schwa is less reduced in words that are more predictable given the following word. In addition, schwa may be longer if the syllable forms a prefix, and in prefixes the duration of schwa is positively correlated with the frequency of the word relative to its stem. Our results suggest that the conditions which favor reduced realizations are more complex than one would expect on the basis of the current literature.
  • Kuzla, C., Ernestus, M., & Mitterer, H. (2010). Compensation for assimilatory devoicing and prosodic structure in German fricative perception. In C. Fougeron, B. Kühnert, M. D'Imperio, & N. Vallée (Eds.), Laboratory Phonology 10 (pp. 731-757). Berlin: De Gruyter.
  • Pluymaekers, M., Ernestus, M., Baayen, R. H., & Booij, G. (2010). Morphological effects on fine phonetic detail: The case of Dutch -igheid. In C. Fougeron, B. Kühnert, M. D'Imperio, & N. Vallée (Eds.), Laboratory Phonology 10 (pp. 511-532). Berlin: De Gruyter.
  • Scharenborg, O., Wan, V., & Ernestus, M. (2010). Unsupervised speech segmentation: An analysis of the hypothesized phone boundaries. Journal of the Acoustical Society of America, 127, 1084-1095. doi:10.1121/1.3277194.

    Abstract

    Despite using different algorithms, most unsupervised automatic phone segmentation methods achieve similar performance in terms of percentage correct boundary detection. Nevertheless, unsupervised segmentation algorithms are not able to perfectly reproduce manually obtained reference transcriptions. This paper investigates fundamental problems for unsupervised segmentation algorithms by comparing a phone segmentation obtained using only the acoustic information present in the signal with a reference segmentation created by human transcribers. The analyses of the output of an unsupervised speech segmentation method that uses acoustic change to hypothesize boundaries showed that acoustic change is a fairly good indicator of segment boundaries: over two-thirds of the hypothesized boundaries coincide with segment boundaries. Statistical analyses showed that the errors are related to segment duration, sequences of similar segments, and inherently dynamic phones. In order to improve unsupervised automatic speech segmentation, current one-stage bottom-up segmentation methods should be expanded into two-stage segmentation methods that are able to use a mix of bottom-up information extracted from the speech signal and automatically derived top-down information. In this way, unsupervised methods can be improved while remaining flexible and language-independent.
  • Schuppler, B., Ernestus, M., Van Dommelen, W., & Koreman, J. (2010). Predicting human perception and ASR classification of word-final [t] by its acoustic sub-segmental properties. In Proceedings of the 11th Annual Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan (pp. 2466-2469).

    Abstract

    This paper presents a study on the acoustic sub-segmental properties of word-final /t/ in conversational standard Dutch and how these properties contribute to whether humans and an ASR system classify the /t/ as acoustically present or absent. In general, humans and the ASR system use the same cues (presence of a constriction, a burst, and alveolar frication), but the ASR system is also less sensitive to fine cues (weak bursts, smoothly starting friction) than human listeners and misled by the presence of glottal vibration. These data inform the further development of models of human and automatic speech processing.
  • Sikveland, A., Öttl, A., Amdal, I., Ernestus, M., Svendsen, T., & Edlund, J. (2010). Spontal-N: A Corpus of Interactional Spoken Norwegian. In N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, & D. Tapias (Eds.), Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10) (pp. 2986-2991). Paris: European Language Resources Association (ELRA).

    Abstract

    Spontal-N is a corpus of spontaneous, interactional Norwegian. To our knowledge, it is the first corpus of Norwegian in which the majority of speakers have spent significant parts of their lives in Sweden, and in which the recorded speech displays varying degrees of interference from Swedish. The corpus consists of studio quality audio- and video-recordings of four 30-minute free conversations between acquaintances, and a manual orthographic transcription of the entire material. On basis of the orthographic transcriptions, we automatically annotated approximately 50 percent of the material on the phoneme level, by means of a forced alignment between the acoustic signal and pronunciations listed in a dictionary. Approximately seven percent of the automatic transcription was manually corrected. Taking the manual correction as a gold standard, we evaluated several sources of pronunciation variants for the automatic transcription. Spontal-N is intended as a general purpose speech resource that is also suitable for investigating phonetic detail.
  • Spilková, H., Brenner, D., Öttl, A., Vondřička, P., Van Dommelen, W., & Ernestus, M. (2010). The Kachna L1/L2 picture replication corpus. In N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, & D. Tapias (Eds.), Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10) (pp. 2432-2436). Paris: European Language Resources Association (ELRA).

    Abstract

    This paper presents the Kachna corpus of spontaneous speech, in which ten Czech and ten Norwegian speakers were recorded both in their native language and in English. The dialogues are elicited using a picture replication task that requires active cooperation and interaction of speakers by asking them to produce a drawing as close to the original as possible. The corpus is appropriate for the study of interactional features and speech reduction phenomena across native and second languages. The combination of productions in non-native English and in speakers’ native language is advantageous for investigation of L2 issues while providing a L1 behaviour reference from all the speakers. The corpus consists of 20 dialogues comprising 12 hours 53 minutes of recording, and was collected in 2008. Preparation of the transcriptions, including a manual orthographic transcription and an automatically generated phonetic transcription, is currently in progress. The phonetic transcriptions are automatically generated by aligning acoustic models with the speech signal on the basis of the orthographic transcriptions and a dictionary of pronunciation variants compiled for the relevant language. Upon completion the corpus will be made available via the European Language Resources Association (ELRA).
  • Torreira, F., & Ernestus, M. (2010). Phrase-medial vowel devoicing in spontaneous French. In Proceedings of the 11th Annual Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan (pp. 2006-2009).

    Abstract

    This study investigates phrase-medial vowel devoicing in European French (e.g. /ty po/ [typo] 'you can'). Our spontaneous speech data confirm that French phrase-medial devoicing is a frequent phenomenon affecting high vowels preceded by voiceless consonants. We also found that devoicing is more frequent in temporally reduced and coarticulated vowels. Complete and partial devoicing were conditioned by the same variables (speech rate, consonant type and distance from the end of the AP). Given these results, we propose that phrase-medial vowel devoicing in French arises mainly from the temporal compression of vocalic gestures and the aerodynamic conditions imposed by high vowels.
  • Torreira, F., Adda-Decker, M., & Ernestus, M. (2010). The Nijmegen corpus of casual French. Speech Communication, 52, 201-212. doi:10.1016/j.specom.2009.10.004.

    Abstract

    This article describes the preparation, recording and orthographic transcription of a new speech corpus, the Nijmegen Corpus of Casual French (NCCFr). The corpus contains a total of over 36 h of recordings of 46 French speakers engaged in conversations with friends. Casual speech was elicited during three different parts, which together provided around 90 min of speech from every pair of speakers. While Parts 1 and 2 did not require participants to perform any specific task, in Part 3 participants negotiated a common answer to general questions about society. Comparisons with the ESTER corpus of journalistic speech show that the two corpora contain speech of considerably different registers. A number of indicators of casualness, including swear words, casual words, verlan, disfluencies and word repetitions, are more frequent in the NCCFr than in the ESTER corpus, while the use of double negation, an indicator of formal speech, is less frequent. In general, these estimates of casualness are constant through the three parts of the recording sessions and across speakers. Based on these facts, we conclude that our corpus is a rich resource of highly casual speech, and that it can be effectively exploited by researchers in language science and technology.

    Files private

    Request files
  • Torreira, F., & Ernestus, M. (2010). The Nijmegen corpus of casual Spanish. In N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, & D. Tapias (Eds.), Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC'10) (pp. 2981-2985). Paris: European Language Resources Association (ELRA).

    Abstract

    This article describes the preparation, recording and orthographic transcription of a new speech corpus, the Nijmegen Corpus of Casual Spanish (NCCSp). The corpus contains around 30 hours of recordings of 52 Madrid Spanish speakers engaged in conversations with friends. Casual speech was elicited during three different parts, which together provided around ninety minutes of speech from every group of speakers. While Parts 1 and 2 did not require participants to perform any specific task, in Part 3 participants negotiated a common answer to general questions about society. Information about how to obtain a copy of the corpus can be found online at http://mirjamernestus.ruhosting.nl/Ernestus/NCCSp
  • Van de Ven, M., Tucker, B. V., & Ernestus, M. (2010). Semantic facilitation in bilingual everyday speech comprehension. In Proceedings of the 11th Annual Conference of the International Speech Communication Association (Interspeech 2010), Makuhari, Japan (Interspeech 2010), Makuhari, Japan (pp. 1245-1248).

    Abstract

    Previous research suggests that bilinguals presented with low and high predictability sentences benefit from semantics in clear but not in conversational speech [1]. In everyday speech, however, many words are not highly predictable. Previous research has shown that native listeners can use also more subtle semantic contextual information [2]. The present study reports two auditory lexical decision experiments investigating to what extent late Asian-English bilinguals benefit from subtle semantic cues in their processing of English unreduced and reduced speech. Our results indicate that these bilinguals are less sensitive to semantic cues than native listeners for both speech registers.
  • Ernestus, M., & Mak, W. M. (2004). Distinctive phonological features differ in relevance for both spoken and written word recognition. Brain and Language, 90(1-3), 378-392. doi:10.1016/S0093-934X(03)00449-8.

    Abstract

    This paper discusses four experiments on Dutch which show that distinctive phonological features differ in their relevance for word recognition. The relevance of a feature for word recognition depends on its phonological stability, that is, the extent to which that feature is generally realized in accordance with its lexical specification in the relevant word position. If one feature value is uninformative, all values of that feature are less relevant for word recognition, with the least informative feature being the least relevant. Features differ in their relevance both in spoken and written word recognition, though the differences are more pronounced in auditory lexical decision than in self-paced reading.
  • Ernestus, M., & Baayen, R. H. (2004). Analogical effects in regular past tense production in Dutch. Linguistics, 42(5), 873-903. doi:10.1515/ling.2004.031.

    Abstract

    This study addresses the question to what extent the production of regular past tense forms in Dutch is a¤ected by analogical processes. We report an experiment in which native speakers of Dutch listened to existing regular verbs over headphones, and had to indicate which of the past tense allomorphs, te or de, was appropriate for these verbs. According to generative analyses, the choice between the two su‰xes is completely regular and governed by the underlying [voice]-specification of the stem-final segment. In this approach, no analogical e¤ects are expected. In connectionist and analogical approaches, by contrast, the phonological similarity structure in the lexicon is expected to a¤ect lexical processing. Our experimental results support the latter approach: all participants created more nonstandard past tense forms, produced more inconsistency errors, and responded more slowly for verbs with stronger analogical support for the nonstandard form.
  • Ernestus, M., & Baayen, R. H. (2004). Kuchde, tobte, en turfte: Lekkage in 't kofschip. Onze Taal, 73(12), 360-361.
  • Kemps, R. J. J. K., Ernestus, M., Schreuder, R., & Baayen, R. H. (2004). Processing reduced word forms: The suffix restoration effect. Brain and Language, 90(1-3), 117-127. doi:10.1016/S0093-934X(03)00425-5.

    Abstract

    Listeners cannot recognize highly reduced word forms in isolation, but they can do so when these forms are presented in context (Ernestus, Baayen, & Schreuder, 2002). This suggests that not all possible surface forms of words have equal status in the mental lexicon. The present study shows that the reduced forms are linked to the canonical representations in the mental lexicon, and that these latter representations induce reconstruction processes. Listeners restore suffixes that are partly or completely missing in reduced word forms. A series of phoneme-monitoring experiments reveals the nature of this restoration: the basis for suffix restoration is mainly phonological in nature, but orthography has an influence as well.
  • Moscoso del Prado Martín, F., Ernestus, M., & Baayen, R. H. (2004). Do type and token effects reflect different mechanisms? Connectionist modeling of Dutch past-tense formation and final devoicing. Brain and Language, 90(1-3), 287-298. doi:10.1016/j.bandl.2003.12.002.

    Abstract

    In this paper, we show that both token and type-based effects in lexical processing can result from a single, token-based, system, and therefore, do not necessarily reflect different levels of processing. We report three Simple Recurrent Networks modeling Dutch past-tense formation. These networks show token-based frequency effects and type-based analogical effects closely matching the behavior of human participants when producing past-tense forms for both existing verbs and pseudo-verbs. The third network covers the full vocabulary of Dutch, without imposing predefined linguistic structure on the input or output words.
  • Rietveld, T., Van Hout, R., & Ernestus, M. (2004). Pitfalls in corpus research. Computers and the Humanities, 38(4), 343-362. doi:10.1007/s10579-004-1919-1.

    Abstract

    This paper discusses some pitfalls in corpus research and suggests solutions on the basis of examples and computer simulations. We first address reliability problems in language transcriptions, agreement between transcribers, and how disagreements can be dealt with. We then show that the frequencies of occurrence obtained from a corpus cannot always be analyzed with the traditional X2 test, as corpus data are often not sequentially independent and unit independent. Next, we stress the relevance of the power of statistical tests, and the sizes of statistically significant effects. Finally, we point out that a t-test based on log odds often provides a better alternative to a X2 analysis based on frequency counts.

Share this page