Presentations

  • Bujok, R., Peeters, D., Meyer, A. S., & Bosker, H. R. (2023). When the beat drops – beat gestures recalibrate lexical stress perception. Talk presented at the 1st International Multimodal Communication Symposium (MMSYM 2023). Barcelona, Spain. 2023-04-26 - 2023-04-28.
  • Bujok, R., Peeters, D., Meyer, A. S., & Bosker, H. R. (2023). Beat gestures can drive recalibration of lexical stress perception. Poster presented at the 5th Phonetics and Phonology in Europe Conference (PaPE 2023), Nijmegen, The Netherlands.
  • Bujok, R., Peeters, D., Meyer, A. S., & Bosker, H. R. (2023). Beat gestures can drive recalibration of lexical stress perception. Poster presented at the Donders Poster Session 2023, Nijmegen, The Netherlands.
  • Hong, Y., Rohrer, P. L., & Bosker, H. R. (2023). Do beat gestures influence audiovisual lexical tone perception in Mandarin? Poster presented at AMLaP Asia 2023, Hong Kong.
  • Mok, I., Bujok, R., & Bosker, H. R. (2023). Visual articulatory gestures guide audiovisual speech perception of lexical stress but only in noise. Talk presented at the 8th Gesture and Speech in Interaction (GESPIN 2023). Nijmegen, The Netherlands. 2023-09-13 - 2023-09-15.
  • Motiekaitytė, K., Grosseck, O., Wolf, L., Bosker, H. R., Ortega, G., Perlman, M., Peeters, D., & Raviv, L. (2023). Multimodality in emerging communication systems: a virtual reality approach. Poster presented at the 8th Gesture and Speech in Interaction (GESPIN 2023), Nijmegen, The Netherlands.
  • Severijnen, G. G. A., Bosker, H. R., & McQueen, J. M. (2023). Syllable rate drives rate normalization, but is not the only factor. Poster presented at the 20th International Congress of the Phonetic Sciences (ICPhS 2023), Prague, Czech Republic.
  • Severijnen, G., Bosker, H. R., & McQueen, J. M. (2023). Listeners prioritize acoustic information over orthographic information in rate normalization. Poster presented at the 29th Architectures and Mechanisms for Language Processing Conference (AMLaP 2023), Donostia–San Sebastián, Spain.
  • Severijnen, G. G., Bosker, H. R., & McQueen, J. M. (2023). Individual differences in lexical stress in Dutch: An examination of cue weighting in production. Talk presented at the 5th Phonetics and Phonology in Europe Conference (PaPE 2023). Nijmegen, The Netherlands. 2023-06-02 - 2023-06-04.
  • Uluşahin, O., Bosker, H. R., McQueen, J. M., & Meyer, A. S. (2023). No evidence for convergence to sub-phonemic F2 shifts in shadowing. Poster presented at the 20th International Congress of the Phonetic Sciences (ICPhS 2023), Prague, Czech Republic.
  • Uluşahin, O., Bosker, H. R., McQueen, J. M., & Meyer, A. S. (2023). The influence of contextual and talker F0 information on fricative perception. Poster presented at the 5th Phonetics and Phonology in Europe Conference (PaPE 2023), Nijmegen, The Netherlands.
  • Uluşahin, O., Bosker, H. R., McQueen, J. M., & Meyer, A. S. (2023). Listeners converge to fundamental frequency in synchronous speech. Poster presented at the 19th NVP Winter Conference on Brain and Cognition, Egmond aan Zee, The Netherlands.

    Abstract

    Convergence broadly refers to interlocutors’ tendency to progressively sound more like each other over time. Recent empirical work has used various experimental paradigms to observe convergence in voice fundamental frequency (f0). One study used stable mean f0 over trials in a synchronous speech task with manipulated (i.e., high and low) f0 conditions (Bradshaw & McGettigan, 2021). Here, we attempted to replicate this study in Dutch. First, in a reading task, participants read 40 sentences at their own pace to establish f0 baselines. Later, in a synchronous speech task, participants read 80 sentences in synchrony with a speaker whose voice was shifted 2 semitones above or below a reference mean f0 value (for the high and low f0 conditions, respectively). The reference mean f0 value and the manipulation size were determined across multiple pre-tests. Our results revealed that the f0 manipulation significantly predicted f0 convergence in both the high and low f0 conditions. Furthermore, the proportion of convergers in the sample was larger than that reported by Bradshaw & McGettigan, highlighting the benefits of stimulus optimization. Our study thus provides stronger evidence that the pitch of two talkers tends to converge as they speak together.
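    A shift of ±2 semitones corresponds to multiplying the reference f0 by 2^(±2/12). A minimal sketch of that conversion (the 200 Hz reference below is an illustrative value, not a figure from the study):

```python
# Hedged sketch: converting a semitone (st) shift into a frequency ratio,
# as used when shifting a reference mean f0 up or down by 2 st.
# The 200 Hz reference is illustrative, not a value from the study.

def shift_f0(f0_hz: float, semitones: float) -> float:
    """Return f0_hz shifted by the given number of semitones (12 st = 1 octave)."""
    return f0_hz * 2.0 ** (semitones / 12.0)

high = shift_f0(200.0, +2.0)  # high condition: ~224.5 Hz
low = shift_f0(200.0, -2.0)   # low condition: ~178.2 Hz
```

    Note that the ratio is multiplicative, so a +2 st shift followed by a −2 st shift returns the original f0 exactly.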
  • Bosker, H. R., & Bujok, R. (2022). Recalibrating lexical stress perception with lexical context and manual beat gestures. Talk presented at the 28th Architectures and Mechanisms for Language Processing Conference (AMLaP 2022). York, UK. 2022-09-07 - 2022-09-09.
  • Bujok, R., Meyer, A. S., & Bosker, H. R. (2022). Beat gestures influence audiovisual lexical stress perception, while visible facial cues do not. Poster presented at the 35th Annual Conference on Human Sentence Processing (HSP 2022), Virtual meeting.
  • Bujok, R., Meyer, A. S., & Bosker, H. R. (2022). Visible lexical stress cues on the face do not influence audiovisual speech perception. Talk presented at Speech Prosody 2022. Lisbon, Portugal. 2022-05-23 - 2022-05-26.
  • Bujok, R., Peeters, D., Meyer, A. S., & Bosker, H. R. (2022). Do manual beat gestures recalibrate the perception of lexical stress? Talk presented at the Psychonomic Society - 63rd Annual Meeting. Boston, USA. 2022-11-17 - 2022-11-20.
  • Bujok, R., Meyer, A. S., & Bosker, H. R. (2022). Not all visual cues to lexical stress affect audiovisual speech perception: beat gestures vs. articulatory cues. Poster presented at IMPRS Conference 2022, Virtual meeting.
  • Bujok, R., Peeters, D., Meyer, A. S., & Bosker, H. R. (2022). Recalibration of lexical stress perception can be driven by visual beat gestures. Talk presented at the Dag van de Fonetiek 2022. Utrecht, The Netherlands. 2022-12-16.
  • Law, R., Kaufeld, G., Bosker, H. R., & Martin, A. E. (2022). Cortical tracking of linguistic units at different speech rates. Poster presented at the 18th NVP Winter Conference on Brain and Cognition, Egmond aan Zee, The Netherlands.
  • Papoutsi, C., Frost, R. L. A., & Bosker, H. R. (2022). Statistical learning at a virtual cocktail party. Poster presented at the Experimental Psychology Society (EPS) Meeting, online.
  • Severijnen, G. G., Bosker, H. R., & McQueen, J. M. (2022). Acoustic correlates of Dutch lexical stress re-examined: Spectral tilt is not always more reliable than intensity. Talk presented at Speech Prosody 2022. Lisbon, Portugal. 2022-05-23 - 2022-05-26.
  • Severijnen, G., Bosker, H. R., & McQueen, J. M. (2022). How do “VOORnaam” and “voorNAAM” differ between talkers? A corpus analysis of individual talker differences in lexical stress in Dutch. Poster presented at the 18th Conference on Laboratory Phonology (LabPhon 18), online.
  • Uluşahin, O., Bosker, H. R., McQueen, J. M., & Meyer, A. S. (2022). Both contextual and talker-bound F0 information affect voiceless fricative perception. Talk presented at De Dag van de Fonetiek. Utrecht, The Netherlands. 2022-12-16.
  • Bosker, H. R., & Heffner, C. (2021). Listening to speech rates at a cocktail party. Talk presented at the 62nd Annual Meeting of the Psychonomic Society. online. 2021-11-04 - 2021-11-07.
  • Bujok, R., Meyer, A. S., & Bosker, H. R. (2021). Lexical stress perception is influenced by seeing a talker’s gesture, but not face. Talk presented at the 19th Annual Auditory Perception, Cognition and Action Meeting (APCAM 2021). Virtual meeting. 2021-11-04.
  • Bujok, R., Meyer, A. S., & Bosker, H. R. (2022). The role of visual articulatory vs. gestural cues in audiovisual lexical stress perception. Talk presented at DGfS-Workshop: Visual Communication. New Theoretical and Empirical Developments (ViCom 2022). Virtual meeting. 2022-02-23 - 2022-02-25.
  • Law, R., Kaufeld, G., Bosker, H. R., & Martin, A. E. (2021). Cortical tracking of linguistic units at different speech rates. Poster presented at the 13th Annual Meeting of the Society for the Neurobiology of Language (SNL 2021), online.
  • Papoutsi, C., Bosker, H. R., & Frost, R. L. A. (2021). Statistical learning at a virtual cocktail party. Poster presented at the 62nd Annual Meeting of the Psychonomic Society, online.
  • Bosker, H. R. (2020). A novel tool for automated assessment of listener transcripts in speech intelligibility studies. Talk presented at the 179th Meeting of the Acoustical Society of America. (virtual conference). 2020-12-07 - 2020-12-11.
  • Bosker, H. R. (2020). Automatic assessment of transcript accuracy for speech intelligibility studies. Talk presented at Middag van de Fonetiek 2020. (virtual conference). 2020-12-18.
  • Bosker, H. R., & Peeters, D. (2020). Beat gestures can change what words you hear. Talk presented at the 7th Gesture and Speech Interaction (GESPIN 2020). (virtual conference). 2020-09-07 - 2020-09-09.
  • Bosker, H. R., & Peeters, D. (2020). Beat gestures can make you hear different vowels. Talk presented at the 61st Annual Meeting of the Psychonomic Society. online. 2020-11-19 - 2020-11-22.
  • Bosker, H. R., Badaya, E., & Corley, M. (2020). Discourse markers activate their, like, cohort competitors. Poster presented at the 26th Architectures and Mechanisms for Language Processing Conference (AMLaP 2020), Potsdam, Germany.
  • Bosker, H. R., & Peeters, D. (2020). How hands help us hear: Evidence for a manual McGurk Effect. Talk presented at Sinn und Bedeutung 25. London, UK. 2020-09-03 - 2020-09-05.
  • Bosker, H. R., & Cooke, M. (2020). More ‘rhythmic’ speech is more intelligible in noise: Evidence from Lombard-inspired speech modifications. Poster presented at the 179th Meeting of the Acoustical Society of America.
  • Bosker, H. R., & Peeters, D. (2020). Seeing a beat gesture can change what speech sounds you hear. Talk presented at the 26th Architectures and Mechanisms for Language Processing Conference (AMLaP 2020). Potsdam, Germany. 2020-09-03 - 2020-09-05.
  • Bosker, H. R., Meyer, A. S., & Maslowski, M. (2020). When speech cues are not integrated immediately: Evidence from the global speech rate effect. Poster presented at the 26th Architectures and Mechanisms for Language Processing Conference (AMLaP 2020), Potsdam, Germany.
  • Severijnen, G., Bosker, H. R., & McQueen, J. M. (2020). The role of talker-specific prosody in predictive speech perception. Poster presented at the 26th Architectures and Mechanisms for Language Processing Conference (AMLaP 2020), Potsdam, Germany.
  • Bosker, H. R. (2019). Both attended and unattended contexts influence speech perception to the same degree. Talk presented at the Experimental Psychology Society London Meeting. London, UK. 2019-01-03 - 2019-01-04.

    Abstract

    Often, listening to a talker also involves ignoring the speech of other talkers (‘cocktail party’ phenomenon). Although cognitively demanding, we are generally quite successful at ignoring competing speech streams in multi-talker situations. However, the present study demonstrates that acoustic context effects are immune to such attentional modulation.

    This study focused on duration-based context effects, presenting ambiguous target sounds after slow vs. fast contexts. Dutch listeners categorized target sounds with a reduced word-initial syllable (e.g., ambiguous between gegaan “gone” vs. gaan “to go”). In Control Experiments 1-2, participants tended to miss the reduced syllable when the target sound was preceded by a slow context sentence, reflecting the expected duration-based context effect. In dichotic Experiments 3-5, two different context talkers were presented, one to each of the participants’ ears. The speech rates of the attended and unattended talkers were found to influence target categorization equally, regardless of whether the attended context was in the same or a different voice from the target, and even when participants could watch the attended talker speak.

    These results demonstrate that acoustic context effects are robust against attentional modulation, suggesting that these effects largely operate at a level in the auditory processing hierarchy that precedes attentional stream segregation.
  • Bosker, H. R. (2019). Normalizing speech sounds for surrounding context: Charting the role of neural oscillations [Invited talk]. Talk presented at the Symposium "Auditory Cortical Entrainment in Relation with Language Processing" at ESCoP 2019. Tenerife, Spain. 2019-09-26.
  • Bosker, H. R. (2019). Speech perception is influenced by the speech rate of both attended and unattended sentence contexts [Invited talk]. Talk presented at the 177th Meeting of the Acoustical Society of America, the special session "Context Effects in Speech Perception". Louisville, KY, USA. 2019-05-13 - 2019-05-17.
  • Kaufeld, G., Bosker, H. R., Alday, P. M., Meyer, A. S., & Martin, A. E. (2019). A timescale-specific hierarchy in cortical oscillations during spoken language comprehension. Poster presented at Language and Music in Cognition: Integrated Approaches to Cognitive Systems (Spring School 2019), Cologne, Germany.
  • Kaufeld, G., Bosker, H. R., Alday, P. M., Meyer, A. S., & Martin, A. E. (2019). Structure and meaning entrain neural oscillations: A timescale-specific hierarchy. Poster presented at the 26th Annual meeting of the Cognitive Neuroscience Society (CNS 2019), San Francisco, CA, USA.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2019). The speech production system is reconfigured to change speaking rate. Poster presented at the 3rd Phonetics and Phonology in Europe conference (PaPE 2019), Lecce, Italy.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2019). The speech production system is reconfigured to change speaking rate. Poster presented at Crossing the Boundaries: Language in Interaction Symposium, Nijmegen, The Netherlands.

    Abstract

    It is evident that speakers can freely vary stylistic features of their speech, such as speech rate, but how they accomplish this has hardly been studied, let alone implemented in a formal model of speech production. Much as in walking and running, where qualitatively different gaits are required to cover the gamut of different speeds, we might predict there to be multiple qualitatively distinct configurations, or ‘gaits’, in the speech planning system that speakers must switch between to alter their speaking rate or style. Alternatively, control might involve continuous modulation of a single ‘gait’. We investigate these possibilities by simulating a connectionist computational model that mimics the temporal characteristics of observed speech. Different ‘regimes’ (combinations of parameter settings) can be engaged to achieve different speaking rates.

    The model was trained separately for each speaking rate by an evolutionary optimisation algorithm. The training identified the parameter values that allowed the model to best approximate the syllable duration distributions characteristic of each speaking rate.

    In a single-gait system, the regimes used to achieve fast and slow speech are qualitatively similar, but quantitatively different. In parameter space, they would be arranged along a straight line, with different points along this axis corresponding to different speaking rates. In a multiple-gait system, this linearity would be missing. Instead, the arrangement of the regimes would be triangular, with no obvious relationship between the regions associated with each gait, and an abrupt shift in parameter values would be required to move from speeds associated with ‘walk-speaking’ to ‘run-speaking’.

    Our model achieved good fits at all three speaking rates. In parameter space, the arrangement of the parameter settings selected for the different speaking rates was non-axial, suggesting that ‘gaits’ are present in the speech planning system.
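    The single-gait vs. multiple-gait distinction above reduces to a geometric question: do the parameter settings fitted for the different speaking rates lie on one straight line in parameter space? A minimal sketch of that collinearity test (the coordinates below are made up for illustration; the study's actual fitted parameter values are not reproduced here):

```python
import numpy as np

# Hedged sketch: three parameter settings (slow, medium, fast) are
# consistent with a single 'gait' if they lie on one straight line in
# regime space. Coordinates are illustrative, not the study's values.

def collinear(p_slow, p_med, p_fast, tol=1e-8):
    """True if the three points lie (near) a single straight line."""
    v1 = np.asarray(p_med, dtype=float) - np.asarray(p_slow, dtype=float)
    v2 = np.asarray(p_fast, dtype=float) - np.asarray(p_slow, dtype=float)
    # Collinear iff the two difference vectors span at most one dimension.
    return bool(np.linalg.matrix_rank(np.vstack([v1, v2]), tol=tol) <= 1)

# A single-gait (axial) arrangement ...
print(collinear([0.0, 0.0], [1.0, 2.0], [2.0, 4.0]))  # True
# ... versus a multiple-gait (non-axial, 'triangular') arrangement.
print(collinear([0.0, 0.0], [1.0, 2.0], [2.0, 0.0]))  # False
```

    The same rank test works in a regime space of any dimensionality, since it only asks whether the two difference vectors are parallel.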
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2018). Attending fast and slow 'cocktail parties': Unattended speech rates influence perception of an attended talker. Talk presented at Architectures and Mechanisms for Language Processing (AMLaP 2018). Berlin, Germany. 2018-09-06 - 2018-09-08.
  • Bosker, H. R. (2018). An oscillations-based model of speech rate normalization [Invited talk]. Talk presented at the Laboratoire Psychologie de la Perception. Paris, France.
  • Bosker, H. R. (2018). How listeners normalize speech: Evidence from neural oscillations [Invited talk]. Talk presented at the Distinguished Speakers in Language Science Colloquium Series. Saarbrücken, Germany. 2018-01-11.

    Abstract

    Speech is remarkably variable: ask 10 talkers to pronounce the same sentence and you’ll end up with 10 unique, acoustically dissimilar realizations. One way in which the listener copes with this acoustic variability is by normalizing speech segments for surrounding temporal and spectral characteristics. That is, a given speech sound can be perceived differently depending on, for instance, the preceding sentence’s speech rate, or average formant values. I will present evidence that these normalization processes occur very early in perceptual processing. Also, using neuroimaging and psychoacoustic data, I will show that temporal normalization may be explained by a neural mechanism involving cortical theta oscillators phase-locking to the syllabic rate of speech. Thus, I propose a neurobiologically plausible model of acoustic normalization in speech processing.
  • Bosker, H. R. (2018). How listening to language learners is different from listening to natives [Invited talk]. Talk presented at EMLAR XIV - Experimental Methods in Language Acquisition Research. Utrecht, The Netherlands. 2018-04-18 - 2018-04-20.
  • Bosker, H. R. (2018). Neural entrainment influences the sounds you hear. Talk presented at the International Meeting of the Psychonomic Society. Amsterdam, The Netherlands. 2018-05-10 - 2018-05-12.

    Abstract

    When listening to speech, the brain is known to ‘track’ the spoken signal by phase-locking neural oscillations to the syllabic rate of speech. It remains debated, however, whether this neural entrainment actively shapes speech perception or whether it is merely an epiphenomenon of speech processing. This study, presenting neuroimaging (MEG) and psychoacoustic evidence, reveals that entrained oscillations persist for several cycles after the driving rhythm has ceased. This sustained entrainment, in turn, influences the temporal sampling of subsequent speech segments, biasing ambiguous vowels towards long/short percepts. Thus, these experiments demonstrate the influential role of neural entrainment in speech perception.
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2018). Normalizing vowels at a cocktail party. Poster presented at Architectures and Mechanisms for Language Processing (AMLaP 2018), Berlin, Germany.
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2018). Selective attention to a specific talker does not change the effect of surrounding acoustic context. Poster presented at LabPhon16 - Variation, development and impairment: Between phonetics and phonology, Lisbon, Portugal.

    Abstract

    Spoken sentences contain considerable prosodic variation, for instance in their speech rate [1]. One mechanism by which the listener can overcome such variation is by interpreting the durations of speech sounds relative to the surrounding speech rate. Indeed, in a fast context, a durationally ambiguous sound is perceived as longer than in a slow context [2]. In abstractionist models of spoken word comprehension, this process – known as rate normalization – affects pre-lexical representations before abstract phonological representations are accessed [3]. A recent study [4] provided support for such an early perceptual locus of rate normalization. In that study, participants performed a visual search task that induced high (large grid) vs. low (small grid) cognitive load, while listening to fast and slow context sentences. Context sentences were followed by durationally ambiguous targets. Fast sentences were shown to bias target perception towards more ‘long’ target segments than slow contexts. Critically, changes in cognitive load did not modulate this rate effect. These findings support a model in which normalization processes arise early during perceptual processing; too early to be affected by attentional modulation. The present study further evaluated the cognitive locus of normalization processes by testing the influence of another form of attention: auditory stream segregation. Specifically, if listeners are presented with a fast and a slow talker at the same time but in different ears, does explicitly attending to one or the other stream influence target perception? The aforementioned model [4] predicts that selective attention should not influence target perception, since normalization processes should be robust against changes in attention allocation. Alternatively, if attention does modulate normalization processes, two participants, one attending to the fast and the other to the slow speech, should show different perception.

    Dutch participants (Expt 1: N=32; Expt 2: N=16; Expt 3: N=16) were presented with 200 fast and slow context sentences of various lengths, followed by a target duration continuum ambiguous between, e.g., short target “geven” /ˈxevə/ give vs. long target “gegeven” /xəˈxevə/ given (i.e., 20 target pairs differing in the presence/absence of the unstressed syllable /xə-/). Critically, in Experiment 1, participants heard two talkers simultaneously (talker and location counter-balanced across participants), one (relatively long) sentence at a fast rate, and one (half as long) sentence at a slow rate (rate varied within participants). Context sentences were followed by ambiguous targets from yet another talker (Fig. 1). Half of the participants were instructed to attend to talker A, while the other half attended to talker B. Thus, participants heard identical auditory stimuli, but varied in which talker they attended to. Debriefing questionnaires and transcriptions of attended talkers in filler trials confirmed that participants successfully attended to one talker, and ignored the other. Nevertheless, no effect of attended rate was found (Fig. 2; p>.9), indicating that modulation of attention did not influence participants’ rate normalization. Control experiments showed that it was possible to obtain rate effects with single-talker contexts that were either talker-incongruent (Expt 2) or talker-congruent (Expt 3) with the following target (Fig. 1). In both of these experiments, there was a higher proportion of long target responses following a fast context (Fig. 2). This shows that contextual rate affected the perception of syllabic duration and that talker-congruency with the target did not change the effect. Therefore, in line with [4], the current experiments suggest that normalization processes arise early in perception, and are robust against changes in attention.
  • Bosker, H. R. (2018). The role of rate and rhythm in speech perception [Invited talk]. Talk presented at ENRICH 2018. Berg en Dal, The Netherlands.
  • Kaufeld, G., Naumann, W., Ravenschlag, A., Martin, A. E., & Bosker, H. R. (2018). Contextual speech rate influences morphosyntactic prediction and integration. Talk presented at Architectures and Mechanisms for Language Processing (AMLaP 2018). Berlin, Germany. 2018-09-06 - 2018-09-08.
  • Kaufeld, G., Naumann, W., Martin, A. E., & Bosker, H. R. (2018). Contextual speech rate influences morphosyntactic prediction and integration. Poster presented at LabPhon16 - Variation, development and impairment: Between phonetics and phonology, Lisbon, Portugal.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2018). Do effects of habitual speech rate normalization on perception extend to self? Talk presented at Psycholinguistics in Flanders (PiF 2018). Ghent, Belgium. 2018-06-04 - 2018-06-05.

    Abstract

    Listeners are known to use contextual speech rate in processing temporally ambiguous speech sounds. For instance, a fast adjacent speech context makes a vowel sound relatively long, whereas a slow context makes it sound relatively short (Reinisch & Sjerps, 2013). Besides the local contextual speech rate, listeners also track talker-specific habitual speech rates (Reinisch, 2016; Maslowski et al., in press). However, effects of one’s own speech rate on the perception of another talker’s speech are yet unexplored. Such effects are potentially important, given that, in dialogue, a listener’s own speech often constitutes the context for the interlocutor’s speech. Three experiments tested the contribution of self-produced speech on perception of the habitual speech rate of another talker. In Experiment 1, one group of participants was instructed to speak fast (high-rate group), whereas another group had to speak slowly (low-rate group; 16 participants per group). The two groups were compared on their perception of ambiguous Dutch /A/-/a:/ vowels embedded in neutral rate speech from another talker. In Experiment 2, the same participants listened to playback of their own speech, whilst evaluating target vowels in neutral rate speech as before. Neither of these experiments provided support for the involvement of self-produced speech in perception of another talker's speech rate. Experiment 3 repeated Experiment 2 with a new participant sample, who did not know the participants from the previous two experiments. Here, a group effect was found on perception of the neutral rate talker. This result replicates the finding of Maslowski et al. that habitual speech rates are perceived relative to each other (i.e., neutral rate sounds fast in the presence of a slower talker and vice versa), with naturally produced speech. Taken together, the findings show that self-produced speech is processed differently from speech produced by others. 
    They carry implications for our understanding of the perceptual and cognitive mechanisms involved in rate-dependent speech perception, and for the link between production and perception in dialogue settings.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2018). How speech rate normalization affects lexical access. Talk presented at Architectures and Mechanisms for Language Processing (AMLaP 2018). Berlin, Germany. 2018-09-06 - 2018-09-08.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2018). Self-produced speech rate is processed differently from other talkers' rates. Poster presented at the International Workshop on Language Production (IWLP 2018), Nijmegen, The Netherlands.

    Abstract

    Interlocutors perceive phonemic category boundaries relative to talkers’ produced speech rates. For instance, a temporally ambiguous vowel between Dutch short /A/ and long /a:/ sounds short (i.e., as /A/) in a slow speech context, but long in a fast context. Besides the local contextual speech rate, listeners also track talker-specific habitual speech rates (Maslowski et al., in press). However, it is yet unclear whether self-produced speech rate modulates perception of another talker’s habitual rate. Such effects are potentially important, given that, in dialogue, a listener’s own speech often constitutes the context for the interlocutor’s speech. Three experiments addressed this question. In Experiment 1, one group of participants was instructed to speak fast, whereas another group had to speak slowly (16 participants per group). The two groups were then compared on their perception of ambiguous Dutch /A/-/a:/ vowels embedded in neutral rate speech from another talker. In Experiment 2, the same participants listened to playback of their own speech, whilst evaluating target vowels in neutral rate speech as before. Neither of these experiments provided support for the involvement of self-produced speech in perception of another talker's speech rate. Experiment 3 repeated Experiment 2 with a new participant sample, who were unfamiliar with the participants from the previous two experiments. Here, a group effect was found on perception of the neutral rate talker. This result replicates the finding of Maslowski et al. that habitual speech rates are perceived relative to each other (i.e., neutral rate sounds fast in the presence of a slower talker and vice versa), with naturally produced speech. Taken together, the findings show that self-produced speech is processed differently from speech produced by others. They carry implications for our understanding of the link between production and perception in dialogue.
  • Rodd, J., Bosker, H. R., Ernestus, M., & Ten Bosch, L. (2018). A connectionist model of serial order applied to speaking rate control. Poster presented at Computational Linguistics in the Netherlands 28, Nijmegen, The Netherlands.
  • Rodd, J., Bosker, H. R., Meyer, A. S., Ernestus, M., & Ten Bosch, L. (2018). How to speed up and slow down: Speaking rate control to the level of the syllable. Talk presented at the New Observations in Speech and Hearing seminar series, Institute of Phonetics and Speech processing, LMU Munich. Munich, Germany.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2018). Run-speaking? Simulations of rate control in speech production. Poster presented at Architectures and Mechanisms for Language Processing (AMLaP 2018), Berlin, Germany.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2018). Running or speed-walking? Simulations of speech production at different rates. Poster presented at the International Workshop on Language Production (IWLP 2018), Nijmegen, The Netherlands.

    Abstract

    That speakers can vary their speaking rate is evident, but how they accomplish this has hardly been studied. The effortful experience of deviating from one’s preferred speaking rate might result from shifting between different regimes (system configurations) of the speech planning system. This study investigates control over speech rate through simulations of a new connectionist computational model of the cognitive process of speech production, derived from Dell, Burger and Svec’s (1997) model to fit the temporal characteristics of observed speech. We draw an analogy from human movement: the selection of walking and running gaits to achieve different movement speeds. Are the regimes of the speech production system arranged into multiple ‘gaits’ that resemble walking and running?

    During training of the model, different parameter settings are identified for different speech rates; these can be equated with the regimes of the speech production system. The parameters can be considered dimensions of a high-dimensional ‘regime space’, in which different regimes occupy different parts of the space.

    In a single-gait system, the regimes are qualitatively similar, but quantitatively different: they are arranged along a straight line through regime space, and different points along this axis correspond directly to different speaking rates. In a multiple-gait system, the arrangement of the regimes is more dispersed, with no obvious relationship between the regions associated with each gait.

    After training, the model achieved good fits at all three speaking rates, and the parameter settings associated with each speaking rate were different. The broad arrangement of the parameter settings for the different speaking rates in regime space was non-axial, suggesting that ‘gaits’ may be present in the speech planning system.
  • Rodd, J., Bosker, H. R., Ernestus, M., Ten Bosch, L., & Meyer, A. S. (2018). To speed up, turn up the gain: Acoustic evidence of a 'gain-strategy' for speech planning in accelerated and decelerated speech. Poster presented at LabPhon16 - Variation, development and impairment: Between phonetics and phonology, Lisbon, Portugal.
  • Bosker, H. R. (2017). Comparing the evaluation and processing of native and non-native disfluencies. Talk presented at the DISFLUENCY 2017. Louvain-la-Neuve, Belgium. 2017-02-15 - 2017-02-17.
  • Bosker, H. R., & Cooke, M. (2017). Comparing the rhythmic properties of plain and Lombard speech. Poster presented at the Abstraction, Diversity and Speech Dynamics Workshop, Herrsching, Germany.
  • Bosker, H. R., & Kösem, A. (2017). An entrained rhythm’s frequency, not phase, influences temporal sampling of speech. Talk presented at Interspeech 2017. Stockholm, Sweden. 2017-08-20 - 2017-08-24.

    Abstract

    Brain oscillations have been shown to track the slow amplitude fluctuations in speech during comprehension. Moreover, there is evidence that these stimulus-induced cortical rhythms may persist even after the driving stimulus has ceased. However, how exactly this neural entrainment shapes speech perception remains debated. This behavioral study investigated whether and how the frequency and phase of an entrained rhythm would influence the temporal sampling of subsequent speech.

    In two behavioral experiments, participants were presented with slow and fast isochronous tone sequences, followed by Dutch target words ambiguous between as /ɑs/ “ash” (with a short vowel) and aas /a:s/ “bait” (with a long vowel). Target words were presented at various phases of the entrained rhythm. Both experiments revealed effects of the frequency of the tone sequence on target word perception: fast sequences biased listeners to more long /a:s/ responses. However, no evidence for phase effects could be discerned.

    These findings show that an entrained rhythm’s frequency, but not phase, influences the temporal sampling of subsequent speech. These outcomes are compatible with theories suggesting that sensory timing is evaluated relative to entrained frequency. Furthermore, they suggest that phase tracking of (syllabic) rhythms by theta oscillations plays a limited role in speech parsing.
  • Bosker, H. R. (2017). Foreign languages sound fast: Evidence for the 'Gabbling Foreigner Illusion'. Talk presented at the Dutch Association for Phonetic Sciences. Amsterdam, The Netherlands.

    Abstract

    Anecdotal evidence suggests that unfamiliar languages sound faster than one’s native language. Empirical evidence for this impression has come from explicit tempo judgments. However, it is unknown whether such perceived rate differences between native and foreign languages (FLs) have effects on implicit speech processing.

    Our measure of implicit perception was ‘rate normalization’: Dutch and German listeners interpret vowels midway between /ɑ/ and /a:/ more often as /a:/ if the target vowel follows a fast (vs. slow) sentence. We asked whether such a ‘rate normalization’ effect may be observed when the context is not actually faster but simply spoken in a foreign language.

    Dutch and German participants listened to Dutch and German (rate-matched) fast and slow sentences, followed by non-words that contained vowels from an /a-a:/ duration continuum. Participants indicated which vowel they heard (fap vs. faap). Across three experiments, we consistently found that German listeners reported more /a:/ responses after foreign sentences (vs. native), suggesting that foreign sentences were indeed perceived as faster. However, mixed results were found for the Dutch groups. We conclude that the subjective impression that FLs sound fast may have an effect on implicit speech processing, influencing how language learners perceive spoken segments in a FL.
  • Bosker, H. R. (2017). How your own speech rate can change how you listen to others. Poster presented at the Abstraction, Diversity and Speech Dynamics Workshop, Herrsching, Germany.
  • Bosker, H. R. (2017). Neural entrainment persists after stimulation, guiding temporal sampling of subsequent speech. Poster presented at the Neural Oscillations in Speech and Language Processing symposium, Berlin, Germany.
  • Bosker, H. R., & Cooke, M. (2017). Rhythm in plain and Lombard speech. Poster presented at the 9th Speech in Noise Workshop, Oldenburg, Germany.
  • Bosker, H. R. (2017). The role of temporal amplitude modulations in the political arena: Hillary Clinton vs. Donald Trump. Talk presented at Interspeech 2017. Stockholm, Sweden. 2017-08-20 - 2017-08-24.

    Abstract

    Speech is an acoustic signal with inherent amplitude modulations in the 1-9 Hz range. Recent models of speech perception propose that this rhythmic nature of speech is central to speech recognition. Moreover, rhythmic amplitude modulations have been shown to have beneficial effects on language processing and the subjective impression listeners have of the speaker. This study investigated the role of amplitude modulations in the political arena by comparing the speech produced by Hillary Clinton and Donald Trump in the three presidential debates of 2016.

    Inspection of the modulation spectra, revealing the spectral content of the two speakers’ amplitude envelopes after matching for overall intensity, showed considerably greater power in Clinton’s modulation spectra (compared to Trump’s) across the three debates, particularly in the 1-9 Hz range. The findings suggest that Clinton’s speech had a more pronounced temporal envelope with rhythmic amplitude modulations below 9 Hz, with a preference for modulations around 3 Hz. This may be taken as evidence for a more structured temporal organization of syllables in Clinton’s speech, potentially due to more frequent use of preplanned utterances. Outcomes are interpreted in light of the potential beneficial effects of a rhythmic temporal envelope on intelligibility and speaker perception.
  • Does, R., Van Bergen, G., & Bosker, H. R. (2017). Testing the effect of different disfluency distributions on hearer predictions. Poster presented at DETEC 2017; Discourse Expectations: Theoretical, Experimental and Computational Perspectives, Nijmegen, The Netherlands.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2017). When slow speech sounds fast: How the speech rate of one talker influences perception of another talker. Talk presented at the IPS workshop: Abstraction, Diversity, and Speech Dynamics. Herrsching am Ammersee, Germany. 2017-05-03 - 2017-05-05.

    Abstract

    Listeners are continuously exposed to a broad range of speech rates. Earlier work has shown that listeners perceive phonetic category boundaries relative to contextual speech rate. This process of rate-dependent speech perception has been suggested to occur across talker changes, with the speech rate of talker A influencing perception of talker B. This study tested whether a ‘global’ speech rate calculated over multiple talkers and over a longer period of time affected perception of the temporal Dutch vowel contrast /ɑ/-/a:/. First, Experiment 1 demonstrated that listeners more often reported hearing long /a:/ in fast contexts than in ‘neutral rate’ contexts, replicating earlier findings. Then, in Experiment 2, one participant group was exposed to ‘neutral’ speech from talker A intermixed with slow speech from talker B. Another group listened to the same ‘neutral’ speech from talker A, but to fast speech from talker B. Between-group comparison in the ‘neutral’ condition revealed that Group 1 reported more long /a:/ than Group 2, indicating that A’s ‘neutral’ speech sounded faster when B was slower. Finally, Experiment 3 tested whether talking at slow or fast rates oneself elicits the same ‘global’ rate effects. However, no evidence was found that self-produced speech modulated perception of talker A. This study corroborates the idea that ‘global’ rate-dependent effects occur across talkers, but are insensitive to one’s own speech rate. Results are interpreted in light of the general auditory mechanisms thought to underlie rate normalization, with implications for our understanding of dialogue.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2017). Whether long-term tracking of speech affects perception depends on who is talking. Poster presented at the Donders Poster Sessions, Nijmegen, The Netherlands.

    Abstract

    Speech rate is known to modulate perception of temporally ambiguous speech sounds. For instance, a vowel may be perceived as short when the immediate speech context is slow, but as long when the context is fast. Yet, effects of long-term tracking of speech rate are largely unexplored. Two experiments tested whether long-term tracking of rate influences perception of the temporal Dutch vowel contrast /ɑ/-/a:/. In Experiment 1, one low-rate group listened to ‘neutral’ rate speech from talker A and to slow speech from talker B. Another high-rate group was exposed to the same neutral speech from A, but to fast speech from B. Between-group comparison of the ‘neutral’ trials revealed that the low-rate group reported a higher proportion of /a:/ in A’s ‘neutral’ speech, indicating that A sounded faster when B was slow. Experiment 2 tested whether one’s own speech rate also contributes to effects of long-term tracking of rate. Here, talker B’s speech was replaced by playback of participants’ own fast or slow speech. No evidence was found that one’s own voice affected perception of talker A in larger speech contexts. These results carry implications for our understanding of the mechanisms involved in rate-dependent speech perception and of dialogue.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2017). Whether long-term tracking of speech rate affects perception depends on who is talking. Poster presented at Interspeech 2017, Stockholm, Sweden.

    Abstract

    Speech rate is known to modulate perception of temporally ambiguous speech sounds. For instance, a vowel may be perceived as short when the immediate speech context is slow, but as long when the context is fast. Yet, effects of long-term tracking of speech rate are largely unexplored. Two experiments tested whether long-term tracking of rate influences perception of the temporal Dutch vowel contrast /ɑ/-/a:/. In Experiment 1, one low-rate group listened to 'neutral' rate speech from talker A and to slow speech from talker B. Another high-rate group was exposed to the same neutral speech from A, but to fast speech from B. Between-group comparison of the 'neutral' trials revealed that the low-rate group reported a higher proportion of /a:/ in A's 'neutral' speech, indicating that A sounded faster when B was slow. Experiment 2 tested whether one's own speech rate also contributes to effects of long-term tracking of rate. Here, talker B's speech was replaced by playback of participants' own fast or slow speech. No evidence was found that one's own voice affected perception of talker A in larger speech contexts. These results carry implications for our understanding of the mechanisms involved in rate-dependent speech perception and of dialogue.
  • Rodd, J., Bosker, H. R., Ernestus, M., Ten Bosch, L., & Meyer, A. S. (2017). How we regulate speech rate: Phonetic evidence for a 'gain strategy' in speech planning. Poster presented at the Abstraction, Diversity and Speech Dynamics Workshop, Herrsching, Germany.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2017). Simulating speaking rate control: A spreading activation model of syllable timing. Poster presented at the Workshop Conversational speech and lexical representations, Nijmegen, The Netherlands.

    Abstract

    Speech can be produced at different rates. The ability to produce faster or slower speech may be thought to result from executive control processes enlisted to modulate lexical selection and phonological encoding stages of speech planning.

    This study used simulations of the model of serial order in language by Dell, Burger and Svec (1997, DBS) to characterise the strategies adopted by speakers when naming pictures at fast, medium and slow prescribed rates. Our new implementation of DBS was able to produce activation patterns that correlated strongly with observed syllable-level timing of disyllabic words from this task.

    For each participant, different speaking rates were associated with different regions of the DBS parameter space. The precise placement of the speaking rates in the parameter space differed markedly between participants. Participants applied broadly the same parameter manipulation to accelerate their speech. This was however not the case for deceleration. Hierarchical clustering revealed two distinct patterns of parameter adjustment employed to decelerate speech, suggesting that deceleration is not necessarily achieved by the inverse process of acceleration. In addition, potential refinements to the DBS model are discussed.
  • Bosker, H. R. (2016). Fast and slow listening: how speech rate shapes perception [Invited talk]. Talk presented at the Institute of Phonetics and Speech Processing. Munich, Germany. 2016.

    Abstract

    Words rarely occur in isolation. Rather, they are produced in rich acoustic contexts including the preceding sentence, speech from other talkers, our own speech, background noise, etc. The temporal properties of the acoustic context (e.g., speech rate) have long been known to influence the perception of subsequent words. For instance, the perception of a Dutch vowel ambiguous between short /ɑ/ and long /a:/ may be biased towards long /a:/ if the vowel is preceded by a precursor with a fast speech rate. Many studies in the literature have investigated this process known as rate normalization, showing that rate normalization is a general auditory phenomenon that occurs early in speech perception. However, few studies have come up with an explanatory mechanism that specifies how rate normalization takes place. In this talk, I will present several studies that support the view of rate normalization as an early general auditory process. Furthermore, I will propose a neural mechanism behind rate normalization, involving entrainment of endogenous neural oscillations to the rhythm of the speech signal. Behavioral and neuroimaging (MEG) experiments will be presented in support of this proposal.
  • Bosker, H. R. (2016). How our own voice influences speech perception. Poster presented at the 2nd Workshop on Psycholinguistic Approaches to Speech Recognition in Adverse Conditions (PASRAC), Nijmegen, The Netherlands.

    Abstract

    In natural communication, our own speech and that of others follow each other in rapid succession. As such, the immediate context of an utterance spoken by our conversational partner includes speech that we produced ourselves moments earlier. Given the close temporal proximity of our own speech to that of others, it is surprising to find that there are hardly any studies investigating whether and how the phonetic properties of our own speech may influence our perception of the speech of others. In contrast, effects of surrounding context are well known in the literature. For example, the perception of an ambiguous Dutch vowel midway between short /ɑ/ and long /a:/ may be shifted towards the perception of long /a:/ by presenting it in a context sentence with a fast speech rate. This temporal context effect, known as rate normalization, seems to be a general auditory process which generalizes across different sound sources. For instance, listening to a talker with a fast speech rate may influence our perception of another talker (Newman & Sawusch, 2009). This raises the question whether producing slow or fast speech rates ourselves may also influence our perception of others. This study investigated effects of our own speech rate on our perception of others through a set of experiments targeting rate normalization. In each experiment, fast and slow context sentences were followed by target words containing a vowel continuum from /ɑ/ to /a:/. Experiment 1 used a standard rate normalization design, with participants listening to fast and slow speech followed by ambiguous target words. The categorization patterns of target words, observed in Experiment 1, replicate previous studies showing that hearing a fast speech rate biases subsequent target perception towards /a:/. 
In Experiment 2, participants were instructed to produce the context sentences themselves at a specified fast or slow rate, after which the ambiguous target words were immediately presented auditorily. Participants’ categorization data show that the faster participants produced the context sentences, the more they reported to perceive the target vowel /a:/. That is, participants’ own speech rate influenced their perception of subsequent target words. This suggests that phonetic properties of our own voice can change our perception of others (through normalization for one’s own speech rate). Experiment 3 tested whether covert speech production (i.e., silent production in one’s mind) at different rates may also influence subsequent perception. However, this time no effect of the covertly produced fast and slow rates was observed. Together, Experiment 2 and Experiment 3 suggest a central role for self-monitoring of the external (i.e., overt) speech signal. Concluding, this study finds that variation in speech production may induce variation in speech perception, thus carrying implications for our understanding of spoken communication in common dialogue settings. Moreover, it may provide a novel rationale for phonetic convergence in conversation (when two interlocutors converge towards each other’s speech rate). That is, phonetic convergence may not only be beneficial for social integration but also help to avoid interfering effects of (self-produced) divergent speech rates.
  • Bosker, H. R. (2016). Huh? Ik versta je niet.. [Huh? I don’t understand you..]. Talk presented at the Science of Tomorrow lectures. The Hague, The Netherlands.
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2016). Listening under cognitive load makes speech sound fast. Talk presented at the Speech Processing in Realistic Environments Workshop. Groningen, The Netherlands. 2016-01-09.
  • Bosker, H. R. (2016). Neural entrainment as a mechanism behind rate normalization in speech perception. Poster presented at the Nijmegen Lectures 2016, Nijmegen, The Netherlands.

    Abstract

    Speech can be delivered at different rates and, as a consequence, listeners have to normalize the incoming speech signal for the rate at which it was produced. This perceptual process, known as rate normalization, is contrastive in nature: for instance, the perception of an ambiguous Dutch vowel in between short /ɑ/ and long /a:/ is biased towards hearing long /a:/ when preceded by a fast sentence context. Previously, rate normalization has (primarily) been explained in terms of durational contrast: the ambiguous vowel is perceived as longer following a fast context because the ambiguous vowel has a relatively long duration compared to the preceding shorter vowels in the fast context. However, durational contrast cannot easily account for findings of rate normalization induced by non-adjacent speech rate, or rate normalization triggered by speech rate calculated over longer periods of time.

    Therefore, neural entrainment of endogenous theta oscillations to the syllabic rate of the speech signal is considered as a novel mechanism behind rate normalization. Instead of contrasting the target sound to the duration of preceding sounds, it is hypothesized that listeners contrast the target sound to the entrained neural rhythm. In order to compare the two accounts of rate normalization (durational contrast vs. neural entrainment), a behavioral experiment was designed in which participants heard Dutch target words ambiguous between /ɑs/ “ash” and /a:s/ “bait”. These target words were preceded by four types of tone precursors, consisting of tone sequences with either short or long tones (71 vs. 125 ms), and presented at a fast or slow tonal rate (4 vs. 7 Hz). Categorization data show that the precursors’ tonal rate, not tonal duration, influenced listeners’ perception towards one of the two words. Thus, this finding challenges durational contrast, and supports neural entrainment, as the mechanism responsible for rate normalization.
  • Bosker, H. R. (2016). Our own speech rate influences speech perception. Poster presented at the Language in Interaction Summerschool on Human Language: From Genes and Brains to Behavior, Berg en Dal, The Netherlands.
  • Bosker, H. R. (2016). Our own speech rate influences speech perception. Poster presented at Speech Prosody 2016, Boston, MA, USA.

    Abstract

    During conversation, spoken utterances occur in rich acoustic contexts, including speech produced by our interlocutor(s) and speech we produced ourselves. Prosodic characteristics of the acoustic context have been known to influence speech perception in a contrastive fashion: for instance, a vowel presented in a fast context is perceived to have a longer duration than the same vowel in a slow context. Given the ubiquity of the sound of our own voice, it may be that our own speech rate - a common source of acoustic context - also influences our perception of the speech of others. Two experiments were designed to test this hypothesis. Experiment 1 replicated earlier contextual rate effects by showing that hearing pre-recorded fast or slow context sentences alters the perception of ambiguous Dutch target words. Experiment 2 then extended this finding by showing that talking at a fast or slow rate prior to the presentation of the target words also altered the perception of those words. These results suggest that between-talker variation in speech rate production may induce between-talker variation in speech perception, thus potentially explaining why interlocutors tend to converge on speech rate in dialogue settings.
  • Bosker, H. R., & Reinisch, E. (2016). Testing the ‘Gabbling Foreigner Illusion’: Do foreign languages sound fast?. Poster presented at the 2nd Workshop on Psycholinguistic Approaches to Speech Recognition in Adverse Conditions (PASRAC), Nijmegen, The Netherlands.

    Abstract

    Anecdotal evidence suggests that unfamiliar languages sound faster than one’s native language. This impression has been termed the ‘Gabbling Foreigner Illusion’ (Cutler, 2012; p.338) and is supported by empirical study. For example, German and Japanese listeners consistently overestimate the other language’s speech rate by about 7-9% (Pfitzinger & Tamashima, 2006). Instead of using explicit rate judgments, the present study set out to test whether the reported illusory rate difference between native and foreign languages would have effects on implicit speech processing. Specifically, we used the effect of normalization for speaking rate as a measure of implicit rate perception. To illustrate, Dutch listeners interpret a vowel midway between /ɑ/ (short duration) and /a:/ (long duration) more often as /a:/ if the target word follows a fast (rather than a slow) sentence (Reinisch & Sjerps, 2013). That is, vowel length is perceived contrastively with the rate of the context. The crucial question of our study is whether such an effect may be observed when the context is not actually faster but simply spoken in a foreign language. Dutch and German versions of 30 sentence contexts were recorded by a Dutch-German bilingual. Sentence pairs were semantically similar across languages and matched in number of syllables. Each sentence was linearly compressed or expanded to a fast and slow version with sentence durations matched across languages. Target ‘words’ contained vowels from a duration continuum from /ɑ/ to /a:/ and were nonwords in both languages. Pretests ensured that the vowel continuum was perceived identically by speakers of Dutch and German. In Experiment 1, Dutch and German listeners were presented with all (fast, slow, Dutch, German) sentences followed by the ambiguous targets. Listeners were asked to decide which nonword they heard (e.g., fap vs. faap). 
The compressed sentences (fast) were expected to trigger more long-vowel responses relative to the expanded (slow) sentences. Similarly, if the ‘Gabbling Foreigner Illusion’ affects speech processing, then listening to one’s foreign language (German for Dutch listeners, and Dutch for Germans) should induce a perceptually faster rate, also leading to more long-vowel responses. Results showed a consistent effect of rate normalization with more ‘long’ responses following the compressed sentences. Moreover, for German listeners, a language effect was found. Foreign (Dutch) sentences triggered more ‘long’ responses than native (German) sentences, suggesting that foreign sentences were indeed perceived as faster than native sentences. However, the opposite was found for the Dutch listeners. For them, their native language (Dutch) sounded faster rather than their foreign language (German). Experiment 2 controlled for additional acoustic properties of the context sentences across the two languages. Even though this manipulation did reduce the language effect in the Dutch group significantly, the overall results were similar: both groups perceived Dutch as faster. Taken together, we conclude that the subjective perception of speaking rate, as suggested by the ‘Gabbling Foreigner Illusion’, may have an effect on speech processing, as shown by the German group. Potential explanations for variation between the two listener groups may be related to varying language proficiency.
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2016). Time flies when you're having fun: Cognitive load makes speech sound fast. Talk presented at the 2nd Workshop on Psycholinguistic Approaches to Speech Recognition in Adverse Conditions (PASRAC). Nijmegen, The Netherlands. 2016-10-31 - 2016-11-01.

    Abstract

    Speech perception in spontaneous conversation typically involves the execution of several concurrent tasks, such as driving a car or searching a menu. This simultaneous attentional and mnemonic processing taxes the cognitive system since it recruits limited central processing resources. How this cognitive load influences speech perception is debated. One account states that cognitive load has detrimental effects on speech perception by disrupting the sublexical (phonetic) encoding of the speech signal. This leads to an ‘impoverished encoding’ (Mattys & Wiget, 2011) of the phonetic cues in the signal, possibly induced by impaired perceptual acuity at the auditory periphery. Another account suggests that cognitive load affects the temporal computation of sensory input. People reliably underestimate durations of sensory input received under cognitive load, including speech (‘shrinking of time’; Block, Hancock, & Zakay, 2010), making spoken segments sound shorter (Casini, Burle, & Nguyen, 2009). This study tested the two accounts of the effects of cognitive load on speech perception (‘impoverished encoding’ and ‘shrinking of time’) by investigating acoustic context effects. The temporal and spectral context in which a particular word occurs influences that word’s perception. For instance, the perception of an ambiguous Dutch vowel midway between /ɑ/ (short duration, low F2) and /a:/ (long duration, high F2) may be biased towards /a:/ by presenting it in a fast context (rate normalization) or a context with a relatively low F2 (spectral normalization; Reinisch & Sjerps, 2013). The ‘impoverished encoding’ account hypothesizes that, when context sentences are presented under cognitive load, the phonetic encoding of the context sentence would be disrupted. As such, the temporal and spectral characteristics of that context sentence should have a reduced influence on the perception of a subsequent target word (cognitive load modulating context effects). 
Alternatively, the ‘shrinking of time’ account holds that cognitive load leads to an underestimation of the duration of the context sentence, inducing a perceptually faster speech rate. This account would therefore not predict a modulation of context effects under cognitive load but rather an independent effect of this perceived increase in speech rate of the context sentence on target perception (higher proportion of /a:/ responses). In two experiments, participants were presented with context sentences followed by target words containing vowels ambiguous between Dutch /ɑ/ and /a:/. In Experiment 1, the context varied in speech rate (slow or fast); in Experiment 2, the context varied in average F2 (high or low). Crucially, during the presentation of the context sentence (not during target presentation), a concurrent easy or difficult visual search task was administered (low vs. high cognitive load). We found reliable acoustic context effects: contexts with a higher speech rate (Experiment 1) or a lower average F2 (Experiment 2) biased target perception towards /a:/. Moreover, cognitive load did not modulate these temporal or spectral context effects. Rather, a consistent main effect of cognitive load was found: higher cognitive load biased perception towards /a:/. This suggests a perceptual increase in the context’s speech rate under increased cognitive load, providing support for the ‘shrinking of time’ account.
  • Kösem, A., Bosker, H. R., Meyer, A. S., Jensen, O., & Hagoort, P. (2016). Neural entrainment reflects temporal predictions guiding speech comprehension. Poster presented at the Eighth Annual Meeting of the Society for the Neurobiology of Language (SNL 2016), London, UK.

    Abstract

    Speech segmentation requires flexible mechanisms to remain robust to features such as speech rate and pronunciation. Recent hypotheses suggest that low-frequency neural oscillations entrain to ongoing syllabic and phrasal rates, and that neural entrainment provides a speech-rate invariant means to discretize linguistic tokens from the acoustic signal. How this mechanism functionally operates remains unclear. Here, we test the hypothesis that neural entrainment reflects temporal predictive mechanisms. It implies that neural entrainment is built on the dynamics of past speech information: the brain would internalize the rhythm of preceding speech to parse the ongoing acoustic signal at optimal time points. A direct prediction is that ongoing neural oscillatory activity should match the rate of preceding speech even if the stimulation changes, for instance when the speech rate suddenly increases or decreases. Crucially, the persistence of neural entrainment to past speech rate should modulate speech perception. We performed an MEG experiment in which native Dutch speakers listened to sentences with varying speech rates. The beginning of the sentence (carrier window) was either presented at a fast or a slow speech rate, while the last three words (target window) were displayed at an intermediate rate across trials. Participants had to report the perception of the last word of the sentence, which was ambiguous with regards to its vowel duration (short vowel /ɑ/ – long vowel /aː/ contrast). MEG data was analyzed in source space using beamformer methods. Consistent with previous behavioral reports, the perception of the ambiguous target word was influenced by the past speech rate; participants reported more /aː/ percepts after a fast speech rate, and more /ɑ/ after a slow speech rate. During the carrier window, neural oscillations efficiently tracked the dynamics of the speech envelope. 
During the target window, we observed oscillatory activity that corresponded in frequency to the preceding speech rate. Traces of neural entrainment to the past speech rate were significantly observed in medial prefrontal areas. Right superior temporal cortex also showed persisting oscillatory activity which correlated with the observed perceptual biases: participants whose perception was more influenced by the manipulation in speech rate also showed stronger remaining neural oscillatory patterns. The results show that neural entrainment lasts after rhythmic stimulation. The findings further provide empirical support for oscillatory models of speech processing, suggesting that neural oscillations actively encode temporal predictions for speech comprehension.
  • Kösem, A., Bosker, H. R., Meyer, A. S., Jensen, O., & Hagoort, P. (2016). Neural entrainment to speech rhythms reflects temporal predictions and influences word comprehension. Poster presented at the 20th International Conference on Biomagnetism (BioMag 2016), Seoul, South Korea.
  • Maslowski, M., Bosker, H. R., & Meyer, A. S. (2016). Slow speech can sound fast: How the speech rate of one talker affects perception of another talker. Talk presented at the Donders Discussions 2016. Nijmegen, The Netherlands. 2016-11-24 - 2016-11-25.
  • Maslowski, M., Bosker, H. R., & Meyer, A. S. (2016). Slow speech can sound fast: How the speech rate of one talker has a contrastive effect on the perception of another talker. Poster presented at Architectures and Mechanisms for Language Processing (AMLaP 2016), Bilbao, Spain.

    Abstract

    Listeners are continuously exposed to a broad range of speech rates. Earlier work has shown that listeners perceive phonetic category boundaries relative to contextual speech rate. It has been suggested that this process of speech rate normalization operates across talker changes. This would predict that the speech rate of talker A influences the perceived rate of another talker, B. We assessed this hypothesis by testing effects of speech rate on the perception of the Dutch vowel continuum /A/-/a:/. One participant group was exposed to 'neutral' speech from talker A intermixed with fast speech from talker B. Another group listened to the same speech from talker A, but to slow speech from talker B. We observed a difference in the perception of talker A depending on the speech rate of talker B: A's 'neutral' speech was perceived as slow when B spoke faster. These findings corroborate the idea that speech rate normalization occurs across talkers, but they challenge the assumption that listeners average over the speech rates of multiple talkers. Instead, they suggest that listeners contrast talker-specific rates.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2016). Slow speech can sound fast: How the speech rate of one talker has a contrastive effect on the perception of another talker. Talk presented at MPI Proudly Presents. Nijmegen, The Netherlands. 2016-06-01.
  • Reinisch, E., & Bosker, H. R. (2016). Does foreign language speech sound faster than one’s native language?. Talk presented at the 2nd workshop on Second Language Prosody (SLaP). Graz, Austria. 2016-11-18 - 2016-11-19.
  • Bosker, H. R., Tjiong, V., Quené, H., Sanders, T., & de Jong, N. H. (2015). Both native and non-native disfluencies trigger listeners’ attention. Poster presented at the 7th Workshop on Disfluency in Spontaneous Speech (DiSS), Edinburgh.
  • Bosker, H. R. (2015). An integrative account of fluency perception. Talk presented at the 8th Anela Applied Linguistics Conference. Egmond aan Zee. 2015-05-22.
  • Bosker, H. R. (2015). How speech rate shapes perception. Talk presented at the Dutch Association for Phonetic Sciences. Utrecht.
  • Bosker, H. R., & Reinisch, E. (2015). Nonnative speech sounds fast: Evidence from speech rate normalization. Poster presented at Architectures and Mechanisms for Language Processing (AMLaP 2015), Malta.
  • Bosker, H. R., & Reinisch, E. (2015). Normalization for speech rate in native and nonnative speech. Talk presented at the 18th International Congress of Phonetic Sciences (ICPhS XVIII). Glasgow. 2015-08-10.
  • Bosker, H. R. (2015). The processing and evaluation of fluency in native and non-native speech. Talk presented at the Grote Taaldag. Utrecht. 2015-02-07.
  • Bosker, H. R., Tjiong, J., Quené, H., Sanders, T., & De Jong, N. H. (2014). Both native and non-native disfluencies trigger listeners' attention. Poster presented at the 20th Architectures and Mechanisms for Language Processing Conference (AMLAP 2014), Edinburgh, Scotland.

    Abstract

    Disfluencies (such as uh and uhm) are a common phenomenon in spontaneous speech. Rather than filtering these hesitations out of the incoming speech signal, listeners are sensitive to disfluency and have been shown to actually use disfluencies for speech comprehension. For instance, disfluencies have been found to have beneficial effects on listeners’ memory. Accumulating evidence indicates that attentional mechanisms underlie this disfluency effect: upon encountering a disfluency, listeners raise their attention to the incoming speech signal. The experiments reported here investigated whether these beneficial effects of disfluency also hold when listening to a non-native speaker. Recent studies on the perception of non-native disfluency suggest that disfluency effects on prediction are attenuated when listening to a non-native speaker. This attenuation may result from listeners being familiar with the more frequent and more variable incidence of disfluencies in non-native speech. If listeners also modulate the beneficial effect of disfluency on memory when listening to a non-native speaker, this would indicate a certain amount of control on the part of the listener over how disfluencies affect attention, and thus comprehension. Furthermore, it would argue against the hypothesis that disfluencies affect comprehension in a rather automatic fashion (cf. the Temporal Delay Hypothesis). Using the Change Detection Paradigm, we presented participants with three-sentence passages that sometimes contained a filled pause (e.g., “... that the patient with the uh wound was...”). After each passage, participants saw a transcript of the spoken passage in which one word had been substituted (e.g., “wound” > “injury”). In our first experiment, participants were more accurate in recalling words from previously heard speech (i.e., detecting the change) if these words had been preceded by a disfluency (relative to a fluent passage). 
Our second experiment, using non-native speech materials, demonstrated that non-native uh’s elicited an effect of the same magnitude and in the same direction: when new participants listened to a non-native speaker producing the same passages, they were also more accurate on disfluent (as compared to fluent) trials. These data suggest that, upon encountering a disfluency, listeners raise their attention levels irrespective of the (non-)native identity of the speaker. Whereas listeners have been found to modulate prediction effects of disfluencies when listening to non-native speech, no such modulation was found for memory effects of disfluencies in the present data, thus potentially constraining the role of listener control in disfluency processing. The current study emphasizes the central role of attention in an account of disfluency processing.
  • Bosker, H. R. (2014). Diversity in how listeners cope with variation in speech. Talk presented at the Workshop 'Combining Different Approaches to Linguistic Diversity'. MPI, Nijmegen. 2014-10-31.
