Publications

  • Brand, S., & Ernestus, M. (2021). Reduction of word-final obstruent-liquid-schwa clusters in Parisian French. Corpus Linguistics and Linguistic Theory, 17(1), 249-285. doi:10.1515/cllt-2017-0067.

    Abstract

    This corpus study investigated pronunciation variants of word-final obstruent-liquid-schwa (OLS) clusters in nouns in casual Parisian French. Results showed that at least one phoneme was absent in 80.7% of the 291 noun tokens in the dataset, and that the whole cluster was absent (e.g., [mis] for ministre) in no less than 15.5% of the tokens. We demonstrate that phonemes are not always completely absent, but that they may leave traces on neighbouring phonemes. Further, the clusters display undocumented voice assimilation patterns. Statistical modelling showed that a phoneme is most likely to be absent if the following phoneme is also absent. The durations of the phonemes are conditioned particularly by the position of the word in the prosodic phrase. We argue, on the basis of three different types of evidence, that in French word-final OLS clusters, the absence of obstruents is mainly due to gradient reduction processes, whereas the absence of schwa and liquids may also be due to categorical deletion processes.
  • Felker, E. R., Broersma, M., & Ernestus, M. (2021). The role of corrective feedback and lexical guidance in perceptual learning of a novel L2 accent in dialogue. Applied Psycholinguistics, 42, 1029-1055. doi:10.1017/S0142716421000205.

    Abstract

    Perceptual learning of novel accents is a critical skill for second-language speech perception, but little is known about the mechanisms that facilitate perceptual learning in communicative contexts. To study perceptual learning in an interactive dialogue setting while maintaining experimental control of the phonetic input, we employed an innovative experimental method incorporating prerecorded speech into a naturalistic conversation. Using both computer-based and face-to-face dialogue settings, we investigated the effect of two types of learning mechanisms in interaction: explicit corrective feedback and implicit lexical guidance. Dutch participants played an information-gap game featuring minimal pairs with an accented English speaker whose /ε/ pronunciations were shifted to /ɪ/. Evidence for the vowel shift came either from corrective feedback about participants’ perceptual mistakes or from onscreen lexical information that constrained their interpretation of the interlocutor’s words. Corrective feedback explicitly contrasting the minimal pairs was more effective than generic feedback. Additionally, both receiving lexical guidance and exhibiting more uptake for the vowel shift improved listeners’ subsequent online processing of accented words. Comparable learning effects were found in both the computer-based and face-to-face interactions, showing that our results can be generalized to a more naturalistic learning context than traditional computer-based perception training programs.
  • Merkx, D., Frank, S. L., & Ernestus, M. (2021). Semantic sentence similarity: Size does not always matter. In Proceedings of Interspeech 2021 (pp. 4393-4397). doi:10.21437/Interspeech.2021-1464.

    Abstract

    This study addresses the question of whether visually grounded speech recognition (VGS) models learn to capture sentence semantics without access to any prior linguistic knowledge. We produce synthetic and natural spoken versions of a well-known semantic textual similarity database and show that our VGS model produces embeddings that correlate well with human semantic similarity judgements. Our results show that a model trained on a small image-caption database outperforms two models trained on much larger databases, indicating that database size is not all that matters. We also investigate the importance of having multiple captions per image and find that this is indeed helpful even if the total number of images is lower, suggesting that paraphrasing is a valuable learning signal. While the general trend in the field is to create ever larger datasets to train models on, our findings indicate other characteristics of the database can be just as important.
  • Ernestus, M. (2014). Acoustic reduction and the roles of abstractions and exemplars in speech processing. Lingua, 142, 27-41. doi:10.1016/j.lingua.2012.12.006.

    Abstract

    Acoustic reduction refers to the frequent phenomenon in conversational speech that words are produced with fewer or lenited segments compared to their citation forms. The few published studies on the production and comprehension of acoustic reduction have important implications for the debate on the relevance of abstractions and exemplars in speech processing. This article discusses these implications. It first briefly introduces the key assumptions of simple abstractionist and simple exemplar-based models. It then discusses the literature on acoustic reduction and draws the conclusion that both types of models need to be extended to explain all findings. The ultimate model should allow for the storage of different pronunciation variants, but also reserve an important role for phonetic implementation. Furthermore, the recognition of a highly reduced pronunciation variant requires top-down information and leads to activation of the corresponding unreduced variant, the variant that reaches listeners’ consciousness. These findings are best accounted for in hybrid models, assuming both abstract representations and exemplars. None of the hybrid models formulated so far can account for all data on reduced speech and we need further research for obtaining detailed insight into how speakers produce and listeners comprehend reduced speech.
  • Ernestus, M., & Giezenaar, G. (2014). Een goed verstaander heeft maar een half woord nodig. In B. Bossers (Ed.), Vakwerk 9: Achtergronden van de NT2-lespraktijk: Lezingen conferentie Hoeven 2014 (pp. 81-92). Amsterdam: BV NT2.
  • Ernestus, M., Kočková-Amortová, L., & Pollak, P. (2014). The Nijmegen corpus of casual Czech. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 365-370).

    Abstract

    This article introduces a new speech corpus, the Nijmegen Corpus of Casual Czech (NCCCz), which contains more than 30 hours of high-quality recordings of casual conversations in Common Czech, among ten groups of three male and ten groups of three female friends. All speakers were native speakers of Czech, raised in Prague or in the region of Central Bohemia, and were between 19 and 26 years old. Every group of speakers consisted of one confederate, who was instructed to keep the conversations lively, and two speakers naive to the purposes of the recordings. The naive speakers were engaged in conversations for approximately 90 minutes, while the confederate joined them for approximately the last 72 minutes. The corpus was orthographically annotated by experienced transcribers and this orthographic transcription was aligned with the speech signal. In addition, the conversations were videotaped. This corpus can form the basis for all types of research on casual conversations in Czech, including phonetic research and research on how to improve automatic speech recognition. The corpus will be freely available.
  • Lahey, M., & Ernestus, M. (2014). Pronunciation variation in infant-directed speech: Phonetic reduction of two highly frequent words. Language Learning and Development, 10, 308-327. doi:10.1080/15475441.2013.860813.

    Abstract

    In spontaneous conversations between adults, words are often pronounced with fewer segments or syllables than their citation forms. The question arises whether infant-directed speech also contains phonetic reduction. If so, infants would be presented with speech input that enables them to acquire reduced variants from an early age. This study compared speech directed at 11- and 12-month-old infants with adult-directed conversational speech and adult-directed read speech. In an acoustic study, 216 tokens of the Dutch words allemaal and helemaal from speech corpora were analyzed for duration, number of syllables, and vowel quality. In a perception study, adult participants rated these same materials for reduction and provided phonetic transcriptions. The results show that these two words are frequently reduced in infant-directed speech, and that their degree of reduction is comparable with conversational adult-directed speech. These findings suggest that lexical representations for reduced pronunciation variants can be acquired early in linguistic development.

  • Mizera, P., Pollak, P., Kolman, A., & Ernestus, M. (2014). Impact of irregular pronunciation on phonetic segmentation of Nijmegen corpus of Casual Czech. In P. Sojka, A. Horák, I. Kopecek, & K. Pala (Eds.), Text, Speech and Dialogue: 17th International Conference, TSD 2014, Brno, Czech Republic, September 8-12, 2014. Proceedings (pp. 499-506). Heidelberg: Springer.

    Abstract

    This paper describes a pilot study of phonetic segmentation applied to the Nijmegen Corpus of Casual Czech (NCCCz). This corpus contains informal, highly spontaneous speech, which influences the character of the produced speech at various levels. This work is part of wider research on the analysis of pronunciation reduction in such informal speech. We present an analysis of the accuracy of phonetic segmentation when canonical or reduced pronunciation is used. The achieved segmentation accuracy provides information about the general accuracy of the acoustic modelling that is to be applied in spontaneous speech recognition. As a byproduct of the presented spontaneous speech segmentation, this paper also describes the created lexicon with canonical pronunciations of words in NCCCz, a tool supporting pronunciation checks of lexicon items, and a mini-database of selected utterances from NCCCz, manually labelled at the phonetic level and suitable for evaluation purposes.
  • Schertz, J., & Ernestus, M. (2014). Variability in the pronunciation of non-native English the: Effects of frequency and disfluencies. Corpus Linguistics and Linguistic Theory, 10, 329-345. doi:10.1515/cllt-2014-0024.

    Abstract

    This study examines how lexical frequency and planning problems can predict phonetic variability in the function word ‘the’ in conversational speech produced by non-native speakers of English. We examined 3180 tokens of ‘the’ drawn from English conversations between native speakers of Czech or Norwegian. Using regression models, we investigated the effect of following word frequency and disfluencies on three phonetic parameters: vowel duration, vowel quality, and consonant quality. Overall, the non-native speakers showed variation that is very similar to the variation displayed by native speakers of English. Like native speakers, Czech speakers showed an effect of frequency on vowel durations, which were shorter in more frequent word sequences. Both groups of speakers showed an effect of frequency on consonant quality: the substitution of another consonant for /ð/ occurred more often in the context of more frequent words. The speakers in this study also showed a native-like allophonic distinction in vowel quality, in which /ði/ occurs more often before vowels and /ðə/ before consonants. Vowel durations were longer in the presence of following disfluencies, again mirroring patterns in native speakers, and the consonant quality was more likely to be the target /ð/ before disfluencies, as opposed to a different consonant. The fact that non-native speakers show native-like sensitivity to lexical frequency and disfluencies suggests that these effects are consequences of a general, non-language-specific production mechanism governing language planning. On the other hand, the non-native speakers in this study did not show native-like patterns of vowel quality in the presence of disfluencies, suggesting that the pattern attested in native speakers of English may result from language-specific processes separate from the general production mechanisms.
  • Ten Bosch, L., Ernestus, M., & Boves, L. (2014). Comparing reaction time sequences from human participants and computational models. In Proceedings of Interspeech 2014: 15th Annual Conference of the International Speech Communication Association (pp. 462-466).

    Abstract

    This paper addresses the question of how to compare reaction times computed by a computational model of speech comprehension with reaction times observed from participants. The question is based on the observation that reaction time sequences differ substantially per participant, which raises the issue of how exactly the model is to be assessed. Part of the variation in reaction time sequences is caused by the so-called local speed: the current reaction time correlates to some extent with a number of previous reaction times, due to slow variations in attention, fatigue, etc. This paper proposes a method, based on time series analysis, to filter the observed reaction times in order to separate out the local speed effects. Results show that after such filtering both the between-participant correlations and the average correlation between participant and model increase. The presented technique provides insights into relevant aspects that are to be taken into account when comparing reaction time sequences.
  • Ernestus, M., Mak, W. M., & Baayen, R. H. (2005). Waar 't kofschip strandt. Levende Talen Magazine, 92, 9-11.
  • Ernestus, M., & Mak, W. M. (2005). Analogical effects in reading Dutch verb forms. Memory & Cognition, 33(7), 1160-1173.

    Abstract

    Previous research has shown that the production of morphologically complex words in isolation is affected by the properties of morphologically, phonologically, or semantically similar words stored in the mental lexicon. We report five experiments with Dutch speakers that show that reading an inflectional word form in its linguistic context is also affected by analogical sets of formally similar words. Using the self-paced reading technique, we show in Experiments 1-3 that an incorrectly spelled suffix delays readers less if the incorrect spelling is in line with the spelling of verbal suffixes in other inflectional forms of the same verb. In Experiments 4 and 5, our use of the self-paced reading technique shows that formally similar words with different stems affect the reading of incorrect suffixal allomorphs on a given stem. These intra- and interparadigmatic effects in reading may be due to online processes or to the storage of incorrect forms resulting from analogical effects in production.
  • Kemps, R. J. J. K., Wurm, L. H., Ernestus, M., Schreuder, R., & Baayen, R. H. (2005). Prosodic cues for morphological complexity in Dutch and English. Language and Cognitive Processes, 20(1/2), 43-73. doi:10.1080/01690960444000223.

    Abstract

    Previous work has shown that Dutch listeners use prosodic information in the speech signal to optimise morphological processing: Listeners are sensitive to prosodic differences between a noun stem realised in isolation and a noun stem realised as part of a plural form (in which the stem is followed by an unstressed syllable). The present study, employing a lexical decision task, provides an additional demonstration of listeners' sensitivity to prosodic cues in the stem. This sensitivity is shown for two languages that differ in morphological productivity: Dutch and English. The degree of morphological productivity does not correlate with listeners' sensitivity to prosodic cues in the stem, but it is reflected in differential sensitivities to the word-specific log odds ratio of encountering an unshortened stem (i.e., a stem in isolation) versus encountering a shortened stem (i.e., a stem followed by a suffix consisting of one or more unstressed syllables). In addition to being sensitive to the prosodic cues themselves, listeners are also sensitive to the probabilities of occurrence of these prosodic cues.
  • Kemps, R. J. J. K., Ernestus, M., Schreuder, R., & Baayen, R. H. (2005). Prosodic cues for morphological complexity: The case of Dutch plural nouns. Memory & Cognition, 33(3), 430-446.

    Abstract

    It has recently been shown that listeners use systematic differences in vowel length and intonation to resolve ambiguities between onset-matched simple words (Davis, Marslen-Wilson, & Gaskell, 2002; Salverda, Dahan, & McQueen, 2003). The present study shows that listeners also use prosodic information in the speech signal to optimize morphological processing. The precise acoustic realization of the stem provides crucial information to the listener about the morphological context in which the stem appears and attenuates the competition between stored inflectional variants. We argue that listeners are able to make use of prosodic information, even though the speech signal is highly variable within and between speakers, by virtue of the relative invariance of the duration of the onset. This provides listeners with a baseline against which the durational cues in a vowel and a coda can be evaluated. Furthermore, our experiments provide evidence for item-specific prosodic effects.
  • Keune, K., Ernestus, M., Van Hout, R., & Baayen, R. H. (2005). Variation in Dutch: From written "mogelijk" to spoken "mok". Corpus Linguistics and Linguistic Theory, 1(2), 183-223. doi:10.1515/cllt.2005.1.2.183.

    Abstract

    In Dutch, high-frequency words with the suffix -lijk are often highly reduced in spontaneous unscripted speech. This study addressed socio-geographic variation in the reduction of such words against the backdrop of the variation in their use in written and spoken Dutch. Multivariate analyses of the frequencies with which the words were used in a factorially contrasted set of subcorpora revealed significant variation involving the speaker's country, sex, and education level for spoken Dutch, and involving country and register for written Dutch. Acoustic analyses revealed that Dutch men reduced most often, while Flemish highly educated women reduced least. Two linguistic context effects emerged, one prosodic, and the other pertaining to the flow of information. Words in sentence final position showed less reduction, while words that were better predictable from the preceding word in the sentence (based on mutual information) tended to be reduced more often. The increased probability of reduction for forms that are more predictable in context, combined with the loss of the suffix in the more extremely reduced forms, suggests that high-frequency words in -lijk are undergoing a process of erosion that causes them to gravitate towards monomorphemic function words.
  • Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2005). Articulatory planning is continuous and sensitive to informational redundancy. Phonetica, 62(2-4), 146-159. doi:10.1159/000090095.

    Abstract

    This study investigates the relationship between word repetition, predictability from neighbouring words, and articulatory reduction in Dutch. For the seven most frequent words ending in the adjectival suffix -lijk, 40 occurrences were randomly selected from a large database of face-to-face conversations. Analysis of the selected tokens showed that the degree of articulatory reduction (as measured by duration and number of realized segments) was affected by repetition, predictability from the previous word and predictability from the following word. Interestingly, not all of these effects were significant across morphemes and target words. Repetition effects were limited to suffixes, while effects of predictability from the previous word were restricted to the stems of two of the seven target words. Predictability from the following word affected the stems of all target words equally, but not all suffixes. The implications of these findings for models of speech production are discussed.
  • Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2005). Lexical frequency and acoustic reduction in spoken Dutch. Journal of the Acoustical Society of America, 118(4), 2561-2569. doi:10.1121/1.2011150.

    Abstract

    This study investigates the effects of lexical frequency on the durational reduction of morphologically complex words in spoken Dutch. The hypothesis that high-frequency words are more reduced than low-frequency words was tested by comparing the durations of affixes occurring in different carrier words. Four Dutch affixes were investigated, each occurring in a large number of words with different frequencies. The materials came from a large database of face-to-face conversations. For each word containing a target affix, one token was randomly selected for acoustic analysis. Measurements were made of the duration of the affix as a whole and the durations of the individual segments in the affix. For three of the four affixes, a higher frequency of the carrier word led to shorter realizations of the affix as a whole, individual segments in the affix, or both. Other relevant factors were the sex and age of the speaker, segmental context, and speech rate. To accommodate these findings, models of speech production should allow word frequency to affect the acoustic realizations of lower-level units, such as individual speech sounds occurring in affixes.
