Publications

Displaying 1 - 21 of 21
  • Bentum, M., Ten Bosch, L., Van den Bosch, A., & Ernestus, M. (2022). Speech register influences listeners’ word expectations. Brain and Language, 235: 105197. doi:10.1016/j.bandl.2022.105197.

    Abstract

    We utilized the N400 effect to investigate the influence of speech register on predictive language processing. Participants listened to long stretches (4 – 15 min) of naturalistic speech from different registers (dialogues, news broadcasts, and read-aloud books), totalling approximately 50,000 words, while the EEG signal was recorded. We estimated the surprisal of words in the speech materials with the aid of a statistical language model in such a manner that it reflected different predictive processing strategies; generic, register-specific, or recency-based. The N400 amplitude was best predicted with register-specific word surprisal, indicating that the statistics of the wider context (i.e., register) influences predictive language processing. Furthermore, adaptation to speech register cannot merely be explained by recency effects; instead, listeners adapt their word anticipations to the presented speech register.
  • Cutler, A., Ernestus, M., Warner, N., & Weber, A. (2022). Managing speech perception data sets. In B. McDonnell, E. Koller, & L. B. Collister (Eds.), The Open Handbook of Linguistic Data Management (pp. 565-573). Cambrdige, MA, USA: MIT Press. doi:10.7551/mitpress/12200.003.0055.
  • Eijk, L., Rasenberg, M., Arnese, F., Blokpoel, M., Dingemanse, M., Doeller, C. F., Ernestus, M., Holler, J., Milivojevic, B., Özyürek, A., Pouw, W., Van Rooij, I., Schriefers, H., Toni, I., Trujillo, J. P., & Bögels, S. (2022). The CABB dataset: A multimodal corpus of communicative interactions for behavioural and neural analyses. NeuroImage, 264: 119734. doi:10.1016/j.neuroimage.2022.119734.

    Abstract

    We present a dataset of behavioural and fMRI observations acquired in the context of humans involved in multimodal referential communication. The dataset contains audio/video and motion-tracking recordings of face-to-face, task-based communicative interactions in Dutch, as well as behavioural and neural correlates of participants’ representations of dialogue referents. Seventy-one pairs of unacquainted participants performed two interleaved interactional tasks in which they described and located 16 novel geometrical objects (i.e., Fribbles) yielding spontaneous interactions of about one hour. We share high-quality video (from three cameras), audio (from head-mounted microphones), and motion-tracking (Kinect) data, as well as speech transcripts of the interactions. Before and after engaging in the face-to-face communicative interactions, participants’ individual representations of the 16 Fribbles were estimated. Behaviourally, participants provided a written description (one to three words) for each Fribble and positioned them along 29 independent conceptual dimensions (e.g., rounded, human, audible). Neurally, fMRI signal evoked by each Fribble was measured during a one-back working-memory task. To enable functional hyperalignment across participants, the dataset also includes fMRI measurements obtained during visual presentation of eight animated movies (35 minutes total). We present analyses for the various types of data demonstrating their quality and consistency with earlier research. Besides high-resolution multimodal interactional data, this dataset includes different correlates of communicative referents, obtained before and after face-to-face dialogue, allowing for novel investigations into the relation between communicative behaviours and the representational space shared by communicators. This unique combination of data can be used for research in neuroscience, psychology, linguistics, and beyond.
  • Marcoux, K., Cooke, M., Tucker, B. V., & Ernestus, M. (2022). The Lombard intelligibility benefit of native and non-native speech for native and non-native listeners. Speech Communication, 136, 53-62. doi:10.1016/j.specom.2021.11.007.

    Abstract

    Speech produced in noise (Lombard speech) is more intelligible than speech produced in quiet (plain speech). Previous research on the Lombard intelligibility benefit focused almost entirely on how native speakers produce and perceive Lombard speech. In this study, we investigate the size of the Lombard intelligibility benefit of both native (American-English) and non-native (native Dutch) English for native and non-native listeners (Dutch and Spanish). We used a glimpsing metric to measure the energetic masking potential of speech, which predicted that both native and non-native Lombard speech could withstand greater amounts of masking to a similar extent, compared to plain speech. In an intelligibility experiment, native English, Spanish, and Dutch listeners listened to the same words, mixed with noise. While the non-native listeners appeared to benefit more from Lombard speech than the native listeners did, each listener group experienced a similar benefit for native and non-native Lombard speech. Energetic masking, as captured by the glimpsing metric, only accounted for part of the Lombard benefit, indicating that the Lombard intelligibility benefit does not only result from a shift in spectral distribution. Despite subtle native language influences on non-native Lombard speech, both native and non-native speech provides a Lombard benefit.
  • Merkx, D., Frank, S. L., & Ernestus, M. (2022). Seeing the advantage: Visually grounding word embeddings to better capture human semantic knowledge. In E. Chersoni, N. Hollenstein, C. Jacobs, Y. Oseki, L. Prévot, & E. Santus (Eds.), Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2022) (pp. 1-11). Stroudsburg, PA, USA: Association for Computational Linguistics (ACL).

    Abstract

    Distributional semantic models capture word-level meaning that is useful in many natural language processing tasks and have even been shown to capture cognitive aspects of word meaning. The majority of these models are purely text based, even though the human sensory experience is much richer. In this paper we create visually grounded word embeddings by combining English text and images and compare them to popular text-based methods, to see if visual information allows our model to better capture cognitive aspects of word meaning. Our analysis shows that visually grounded embedding similarities are more predictive of the human reaction times in a large priming experiment than the purely text-based embeddings. The visually grounded embeddings also correlate well with human word similarity ratings.Importantly, in both experiments we show that he grounded embeddings account for a unique portion of explained variance, even when we include text-based embeddings trained on huge corpora. This shows that visual grounding allows our model to capture information that cannot be extracted using text as the only source of information.
  • Nijveld, A., Ten Bosch, L., & Ernestus, M. (2022). The use of exemplars differs between native and non-native listening. Bilingualism: Language and Cognition, 25(5), 841-855. doi:10.1017/S1366728922000116.

    Abstract

    This study compares the role of exemplars in native and non-native listening. Two English identity priming experiments were conducted with native English, Dutch non-native, and Spanish non-native listeners. In Experiment 1, primes and targets were spoken in the same or a different voice. Only the native listeners showed exemplar effects. In Experiment 2, primes and targets had the same or a different degree of vowel reduction. The Dutch, but not the Spanish, listeners were familiar with this reduction pattern from their L1 phonology. In this experiment, exemplar effects only arose for the Spanish listeners. We propose that in these lexical decision experiments the use of exemplars is co-determined by listeners’ available processing resources, which is modulated by the familiarity with the variation type from their L1 phonology. The use of exemplars differs between native and non-native listening, suggesting qualitative differences between native and non-native speech comprehension processes.
  • Baayen, H., Levelt, W. J. M., Schreuder, R., & Ernestus, M. (2007). Paradigmatic structure in speech production. Proceedings from the Annual Meeting of the Chicago Linguistic Society, 43(1), 1-29.

    Abstract

    The main goal of the present study is to trace the consequences of local and global markedness for the processing of singular and plural nouns. Decompositional models such as proposed by (Pinker (1997); Pinker (1999)) and (Levelt et al. (1999)) predict a lexeme frequency effect and no effects of the frequencies of the singular and the plural forms. Experiments 1 and 4 reveal the expected lexeme frequency effect. Furthermore, in these experiments there are no clear independent effects of the frequencies of the inflected forms. However, the effects of Entropy and Relative Entropy that emerge from these experiments show that in production knowledge of the probabilities of the individual inflected forms do play a role, albeit indirectly. These entropy effects bear witness to the importance of paradigmatic organization of inflected forms in the mental lexicon, both at the level of individual lexemes (Entropy) and at the general level of the class of nouns (Relative Entropy).
  • Ernestus, M., Van Mulken, M., & Baayen, R. H. (2007). Ridders en heiligen in tijd en ruimte: Moderne stylometrische technieken toegepast op Oud-Franse teksten. Taal en Tongval, 58, 1-83.

    Abstract

    This article shows that Old-French literary texts differ systematically in their relative frequencies of syntactic constructions. These frequencies reflect differences in register (poetry versus prose), region (Picardy, Champagne, and Esatern France), time period (until 1250, 1251 – 1300, 1301 – 1350), and genre (hagiography, romance of chivalry, or other).
  • Ernestus, M., & Baayen, R. H. (2007). Paradigmatic effects in auditory word recognition: The case of alternating voice in Dutch. Language and Cognitive Processes, 22(1), 1-24. doi:10.1080/01690960500268303.

    Abstract

    Two lexical decision experiments addressed the role of paradigmatic effects in auditory word recognition. Experiment 1 showed that listeners classified a form with an incorrectly voiced final obstruent more readily as a word if the obstruent is realised as voiced in other forms of that word's morphological paradigm. Moreover, if such was the case, the exact probability of paradigmatic voicing emerged as a significant predictor of the response latencies. A greater probability of voicing correlated with longer response latencies for words correctly realised with voiceless final obstruents. A similar effect of this probability was observed in Experiment 2 for words with completely voiceless or weakly voiced (incompletely neutralised) final obstruents. These data demonstrate the relevance of paradigmatically related complex words for the processing of morphologically simple words in auditory word recognition.
  • Ernestus, M., & Baayen, R. H. (2007). The comprehension of acoustically reduced morphologically complex words: The roles of deletion, duration, and frequency of occurence. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhs 2007) (pp. 773-776). Dudweiler: Pirrot.

    Abstract

    This study addresses the roles of segment deletion, durational reduction, and frequency of use in the comprehension of morphologically complex words. We report two auditory lexical decision experiments with reduced and unreduced prefixed Dutch words. We found that segment deletions as such delayed comprehension. Simultaneously, however, longer durations of the different parts of the words appeared to increase lexical competition, either from the word’s stem (Experiment 1) or from the word’s morphological continuation forms (Experiment 2). Increased lexical competition slowed down especially the comprehension of low frequency words, which shows that speakers do not try to meet listeners’ needs when they reduce especially high frequency words.
  • Ernestus, M., & Baayen, R. H. (2007). Intraparadigmatic effects on the perception of voice. In J. van de Weijer, & E. J. van der Torre (Eds.), Voicing in Dutch: (De)voicing-phonology, phonetics, and psycholinguistics (pp. 153-173). Amsterdam: Benjamins.

    Abstract

    In Dutch, all morpheme-final obstruents are voiceless in word-final position. As a consequence, the distinction between obstruents that are voiced before vowel-initial suffixes and those that are always voiceless is neutralized. This study adds to the existing evidence that the neutralization is incomplete: neutralized, alternating plosives tend to have shorter bursts than non-alternating plosives. Furthermore, in a rating study, listeners scored the alternating plosives as more voiced than the nonalternating plosives, showing sensitivity to the subtle subphonemic cues in the acoustic signal. Importantly, the participants who were presented with the complete words, instead of just the final rhymes, scored the alternating plosives as even more voiced. This shows that listeners’ perception of voice is affected by their knowledge of the obstruent’s realization in the word’s morphological paradigm. Apparently, subphonemic paradigmatic levelling is a characteristic of both production and perception. We explain the effects within an analogy-based approach.
  • Kuperman, V., Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2007). Morphological predictability and acoustic duration of interfixes in Dutch compounds. Journal of the Acoustical Society of America, 121(4), 2261-2271. doi:10.1121/1.2537393.

    Abstract

    This study explores the effects of informational redundancy, as carried by a word's morphological paradigmatic structure, on acoustic duration in read aloud speech. The hypothesis that the more predictable a linguistic unit is, the less salient its realization, was tested on the basis of the acoustic duration of interfixes in Dutch compounds in two datasets: One for the interfix -s- (1155 tokens) and one for the interfix -e(n)- (742 tokens). Both datasets show that the more probable the interfix is, given the compound and its constituents, the longer it is realized. These findings run counter to the predictions of information-theoretical approaches and can be resolved by the Paradigmatic Signal Enhancement Hypothesis. This hypothesis argues that whenever selection of an element from alternatives is probabilistic, the element's duration is predicted by the amount of paradigmatic support for the element: The most likely alternative in the paradigm of selection is realized longer.
  • Kuzla, C., & Ernestus, M. (2007). Prosodic conditioning of phonetic detail of German plosives. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 461-464). Dudweiler: Pirrot.

    Abstract

    The present study investigates the influence of prosodic structure on the fine-grained phonetic details of German plosives which also cue the phonological fortis-lenis contrast. Closure durations were found to be longer at higher prosodic boundaries. There was also less glottal vibration in lenis plosives at higher prosodic boundaries. Voice onset time in lenis plosives was not affected by prosody. In contrast, for the fortis plosives VOT decreased at higher boundaries, as did the maximal intensity of the release. These results demonstrate that the effects of prosody on different phonetic cues can go into opposite directions, but are overall constrained by the need to maintain phonological contrasts. While prosodic effects on some cues are compatible with a ‘fortition’ account of prosodic strengthening or with a general feature enhancement explanation, the effects on others enhance paradigmatic contrasts only within a given prosodic position.
  • Kuzla, C., Cho, T., & Ernestus, M. (2007). Prosodic strengthening of German fricatives in duration and assimilatory devoicing. Journal of Phonetics, 35(3), 301-320. doi:10.1016/j.wocn.2006.11.001.

    Abstract

    This study addressed prosodic effects on the duration of and amount of glottal vibration in German word-initial fricatives /f, v, z/ in assimilatory and non-assimilatory devoicing contexts. Fricatives following /small schwa/ (non-assimilation context) were longer and were produced with less glottal vibration after higher prosodic boundaries, reflecting domain-initial prosodic strengthening. After /t/ (assimilation context), lenis fricatives (/v, z/) were produced with less glottal vibration than after /small schwa/, due to assimilatory devoicing. This devoicing was especially strong across lower prosodic boundaries, showing the influence of prosodic structure on sandhi processes. Reduction in glottal vibration made lenis fricatives more fortis-like (/f, s/). Importantly, fricative duration, another major cue to the fortis-lenis distinction, was affected by initial lengthening, but not by assimilation. Hence, at smaller boundaries, fricatives were more devoiced (more fortis-like), but also shorter (more lenis-like). As a consequence, the fortis and lenis fricatives remained acoustically distinct in all prosodic and segmental contexts. Overall, /z/ was devoiced to a greater extent than /v/. Since /z/ does not have a fortis counterpart in word-initial position, these findings suggest that phonotactic restrictions constrain phonetic processes. The present study illuminates a complex interaction of prosody, sandhi processes, and phonotactics, yielding systematic phonetic cues to prosodic structure and phonological distinctions.
  • Scharenborg, O., Ernestus, M., & Wan, V. (2007). Segmentation of speech: Child's play? In H. van Hamme, & R. van Son (Eds.), Proceedings of Interspeech 2007 (pp. 1953-1956). Adelaide: Causal Productions.

    Abstract

    The difficulty of the task of segmenting a speech signal into its words is immediately clear when listening to a foreign language; it is much harder to segment the signal into its words, since the words of the language are unknown. Infants are faced with the same task when learning their first language. This study provides a better understanding of the task that infants face while learning their native language. We employed an automatic algorithm on the task of speech segmentation without prior knowledge of the labels of the phonemes. An analysis of the boundaries erroneously placed inside a phoneme showed that the algorithm consistently placed additional boundaries in phonemes in which acoustic changes occur. These acoustic changes may be as great as the transition from the closure to the burst of a plosive or as subtle as the formant transitions in low or back vowels. Moreover, we found that glottal vibration may attenuate the relevance of acoustic changes within obstruents. An interesting question for further research is how infants learn to overcome the natural tendency to segment these ‘dynamic’ phonemes.
  • Ernestus, M., & Mak, W. M. (2004). Distinctive phonological features differ in relevance for both spoken and written word recognition. Brain and Language, 90(1-3), 378-392. doi:10.1016/S0093-934X(03)00449-8.

    Abstract

    This paper discusses four experiments on Dutch which show that distinctive phonological features differ in their relevance for word recognition. The relevance of a feature for word recognition depends on its phonological stability, that is, the extent to which that feature is generally realized in accordance with its lexical specification in the relevant word position. If one feature value is uninformative, all values of that feature are less relevant for word recognition, with the least informative feature being the least relevant. Features differ in their relevance both in spoken and written word recognition, though the differences are more pronounced in auditory lexical decision than in self-paced reading.
  • Ernestus, M., & Baayen, R. H. (2004). Analogical effects in regular past tense production in Dutch. Linguistics, 42(5), 873-903. doi:10.1515/ling.2004.031.

    Abstract

    This study addresses the question to what extent the production of regular past tense forms in Dutch is a¤ected by analogical processes. We report an experiment in which native speakers of Dutch listened to existing regular verbs over headphones, and had to indicate which of the past tense allomorphs, te or de, was appropriate for these verbs. According to generative analyses, the choice between the two su‰xes is completely regular and governed by the underlying [voice]-specification of the stem-final segment. In this approach, no analogical e¤ects are expected. In connectionist and analogical approaches, by contrast, the phonological similarity structure in the lexicon is expected to a¤ect lexical processing. Our experimental results support the latter approach: all participants created more nonstandard past tense forms, produced more inconsistency errors, and responded more slowly for verbs with stronger analogical support for the nonstandard form.
  • Ernestus, M., & Baayen, R. H. (2004). Kuchde, tobte, en turfte: Lekkage in 't kofschip. Onze Taal, 73(12), 360-361.
  • Kemps, R. J. J. K., Ernestus, M., Schreuder, R., & Baayen, R. H. (2004). Processing reduced word forms: The suffix restoration effect. Brain and Language, 90(1-3), 117-127. doi:10.1016/S0093-934X(03)00425-5.

    Abstract

    Listeners cannot recognize highly reduced word forms in isolation, but they can do so when these forms are presented in context (Ernestus, Baayen, & Schreuder, 2002). This suggests that not all possible surface forms of words have equal status in the mental lexicon. The present study shows that the reduced forms are linked to the canonical representations in the mental lexicon, and that these latter representations induce reconstruction processes. Listeners restore suffixes that are partly or completely missing in reduced word forms. A series of phoneme-monitoring experiments reveals the nature of this restoration: the basis for suffix restoration is mainly phonological in nature, but orthography has an influence as well.
  • Moscoso del Prado Martín, F., Ernestus, M., & Baayen, R. H. (2004). Do type and token effects reflect different mechanisms? Connectionist modeling of Dutch past-tense formation and final devoicing. Brain and Language, 90(1-3), 287-298. doi:10.1016/j.bandl.2003.12.002.

    Abstract

    In this paper, we show that both token and type-based effects in lexical processing can result from a single, token-based, system, and therefore, do not necessarily reflect different levels of processing. We report three Simple Recurrent Networks modeling Dutch past-tense formation. These networks show token-based frequency effects and type-based analogical effects closely matching the behavior of human participants when producing past-tense forms for both existing verbs and pseudo-verbs. The third network covers the full vocabulary of Dutch, without imposing predefined linguistic structure on the input or output words.
  • Rietveld, T., Van Hout, R., & Ernestus, M. (2004). Pitfalls in corpus research. Computers and the Humanities, 38(4), 343-362. doi:10.1007/s10579-004-1919-1.

    Abstract

    This paper discusses some pitfalls in corpus research and suggests solutions on the basis of examples and computer simulations. We first address reliability problems in language transcriptions, agreement between transcribers, and how disagreements can be dealt with. We then show that the frequencies of occurrence obtained from a corpus cannot always be analyzed with the traditional X2 test, as corpus data are often not sequentially independent and unit independent. Next, we stress the relevance of the power of statistical tests, and the sizes of statistically significant effects. Finally, we point out that a t-test based on log odds often provides a better alternative to a X2 analysis based on frequency counts.

Share this page