Hans Rutger Bosker

Presentations

  • Bosker, H. R. (2019). Both attended and unattended contexts influence speech perception to the same degree. Talk presented at the Experimental Psychology Society London Meeting. London, UK. 2019-01-03 - 2019-01-04.

    Abstract

    Often, listening to a talker also involves ignoring the speech of other talkers (the ‘cocktail party’ phenomenon). Although cognitively demanding, we are generally quite successful at ignoring competing speech streams in multi-talker situations. However, the present study demonstrates that acoustic context effects are immune to such attentional modulation. This study focused on duration-based context effects, presenting ambiguous target sounds after slow vs. fast contexts. Dutch listeners categorized target sounds with a reduced word-initial syllable (e.g., ambiguous between gegaan “gone” vs. gaan “to go”). In Control Experiments 1-2, participants missed the reduced syllable when the target sound was preceded by a slow context sentence, reflecting the expected duration-based context effect. In dichotic Experiments 3-5, two different context talkers were presented to the participants’ two ears. The speech rates of attended and unattended talkers were found to influence target categorization equally, regardless of whether the attended context was in the same voice as the target or a different one, and even when participants could watch the attended talker speak. These results demonstrate that acoustic context effects are robust against attentional modulation, suggesting that these effects largely operate at a level in the auditory processing hierarchy that precedes attentional stream segregation.
  • Bosker, H. R. (2019). Speech perception is influenced by the speech rate of both attended and unattended sentence contexts [Invited talk]. Talk presented at the 177th Meeting of the Acoustical Society of America, the special session "Context Effects in Speech Perception". Louisville, KY, USA. 2019-05-13 - 2019-05-17.
  • Bosker, H. R. (2019). Normalizing speech sounds for surrounding context: Charting the role of neural oscillations [Invited talk]. Talk presented at the Symposium "Auditory Cortical Entrainment in Relation with Language Processing" at ESCoP 2019. Tenerife, Spain. 2019-09-26.
  • Kaufeld, G., Bosker, H. R., Alday, P. M., Meyer, A. S., & Martin, A. E. (2019). A timescale-specific hierarchy in cortical oscillations during spoken language comprehension. Poster presented at Language and Music in Cognition: Integrated Approaches to Cognitive Systems (Spring School 2019), Cologne, Germany.
  • Kaufeld, G., Bosker, H. R., Alday, P. M., Meyer, A. S., & Martin, A. E. (2019). Structure and meaning entrain neural oscillations: A timescale-specific hierarchy. Poster presented at the 26th Annual meeting of the Cognitive Neuroscience Society (CNS 2019), San Francisco, CA, USA.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2019). The speech production system is reconfigured to change speaking rate. Poster presented at Crossing the Boundaries: Language in Interaction Symposium, Nijmegen, The Netherlands.

    Abstract

    It is evident that speakers can freely vary stylistic features of their speech, such as speech rate, but how they accomplish this has hardly been studied, let alone implemented in a formal model of speech production. Much as in walking and running, where qualitatively different gaits are required to cover the gamut of different speeds, we might predict there to be multiple qualitatively distinct configurations, or ‘gaits’, in the speech planning system that speakers must switch between to alter their speaking rate or style. Alternatively, control might involve continuous modulation of a single ‘gait’. We investigate these possibilities through simulations of a connectionist computational model which mimics the temporal characteristics of observed speech. Different ‘regimes’ (combinations of parameter settings) can be engaged to achieve different speaking rates. The model was trained separately for each speaking rate, using an evolutionary optimisation algorithm. The training identified parameter values that allowed the model to best approximate the syllable duration distributions characteristic of each speaking rate. In a single gait system, the regimes used to achieve fast and slow speech are qualitatively similar, but quantitatively different. In parameter space, they would be arranged along a straight line. Different points along this axis correspond to different speaking rates. In a multiple gait system, this linearity would be missing. Instead, the arrangement of the regimes would be triangular, with no obvious relationship between the regions associated with each gait, and an abrupt shift in parameter values to move from speeds associated with ‘walk-speaking’ to ‘run-speaking’. Our model achieved good fits at all three speaking rates. In parameter space, the arrangement of the parameter settings selected for the different speaking rates is non-axial, suggesting that ‘gaits’ are present in the speech planning system.
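The single- vs. multiple-gait contrast above comes down to a geometric question: do the per-rate parameter settings lie along a straight line in parameter space? As an illustrative sketch only (the vectors and their values below are made up, not the model's fitted parameters), collinearity of three regimes can be checked like this:

```python
import numpy as np

# Hypothetical per-rate parameter settings ("regimes") found by the
# optimiser; values are purely illustrative.
slow   = np.array([0.20, 0.55, 1.10])
medium = np.array([0.35, 0.70, 1.40])
fast   = np.array([0.50, 0.85, 1.70])

def axial(p_slow, p_mid, p_fast, tol=1e-6):
    """True if the three regimes lie (near) a straight line in parameter
    space, i.e. a single-'gait' arrangement; False suggests multiple gaits."""
    d1 = p_mid - p_slow
    d2 = p_fast - p_slow
    # Residual after projecting d1 onto d2: zero means perfectly collinear.
    resid = d1 - (d1 @ d2) / (d2 @ d2) * d2
    return np.linalg.norm(resid) < tol

print(axial(slow, medium, fast))  # these illustrative points are collinear, so True
```

A non-axial (e.g., triangular) arrangement would leave a large projection residual, which is the signature the abstract takes as evidence for multiple gaits.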
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2019). The speech production system is reconfigured to change speaking rate. Poster presented at the 3rd Phonetics and Phonology in Europe conference (PaPe 2019), Lecce, Italy.
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2018). Attending fast and slow 'cocktail parties': Unattended speech rates influence perception of an attended talker. Talk presented at Architectures and Mechanisms for Language Processing (AMLaP 2018). Berlin, Germany. 2018-09-06 - 2018-09-08.
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2018). Normalizing vowels at a cocktail party. Poster presented at Architectures and Mechanisms for Language Processing (AMLaP 2018), Berlin, Germany.
  • Bosker, H. R. (2018). An oscillations-based model of speech rate normalization [Invited talk]. Talk presented at the Laboratoire Psychologie de la Perception. Paris, France.
  • Bosker, H. R. (2018). The role of rate and rhythm in speech perception [Invited talk]. Talk presented at ENRICH 2018. Berg en Dal, The Netherlands.
  • Bosker, H. R. (2018). How listeners normalize speech: Evidence from neural oscillations [Invited talk]. Talk presented at the Distinguished Speakers in Language Science Colloquium Series. Saarbrücken, Germany. 2018-01-11.

    Abstract

    Speech is remarkably variable: ask 10 talkers to pronounce the same sentence and you’ll end up with 10 unique, acoustically dissimilar realizations. One way in which the listener copes with this acoustic variability is by normalizing speech segments for surrounding temporal and spectral characteristics. That is, a given speech sound can be perceived differently depending on, for instance, the preceding sentence’s speech rate, or average formant values. I will present evidence that these normalization processes occur very early in perceptual processing. Also, using neuroimaging and psychoacoustic data, I will show that temporal normalization may be explained by a neural mechanism involving cortical theta oscillators phase-locking to the syllabic rate of speech. Thus, I propose a neurobiologically plausible model of acoustic normalization in speech processing.
  • Bosker, H. R. (2018). How listening to language learners is different from listening to natives [Invited talk]. Talk presented at EMLAR XIV - Experimental Methods in Language Acquisition Research. Utrecht, The Netherlands. 2018-04-18 - 2018-04-20.
  • Bosker, H. R. (2018). Neural entrainment influences the sounds you hear. Talk presented at the International Meeting of the Psychonomic Society. Amsterdam, The Netherlands. 2018-05-10 - 2018-05-12.

    Abstract

    When listening to speech, the brain is known to ‘track’ the spoken signal by phase-locking neural oscillations to the syllabic rate of speech. It remains debated, however, whether this neural entrainment actively shapes speech perception or whether it is merely an epiphenomenon of speech processing. This study, presenting neuroimaging (MEG) and psychoacoustic evidence, reveals that entrained oscillations persist for several cycles after the driving rhythm has ceased. This sustained entrainment, in turn, influences the temporal sampling of subsequent speech segments, biasing ambiguous vowels towards long/short percepts. Thus, these experiments demonstrate the influential role of neural entrainment in speech perception.
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2018). Selective attention to a specific talker does not change the effect of surrounding acoustic context. Poster presented at LabPhon16 - Variation, development and impairment: Between phonetics and phonology, Lisbon, Portugal.

    Abstract

    Spoken sentences contain considerable prosodic variation, for instance in their speech rate [1]. One mechanism by which the listener can overcome such variation is by interpreting the durations of speech sounds relative to the surrounding speech rate. Indeed, in a fast context, a durationally ambiguous sound is perceived as longer than in a slow context [2]. In abstractionist models of spoken word comprehension, this process – known as rate normalization – affects pre-lexical representations before abstract phonological representations are accessed [3]. A recent study [4] provided support for such an early perceptual locus of rate normalization. In that study, participants performed a visual search task that induced high (large grid) vs. low (small grid) cognitive load, while listening to fast and slow context sentences. Context sentences were followed by durationally ambiguous targets. Fast sentences were shown to bias target perception towards more ‘long’ target segments than slow contexts. Critically, changes in cognitive load did not modulate this rate effect. These findings support a model in which normalization processes arise early during perceptual processing; too early to be affected by attentional modulation. The present study further evaluated the cognitive locus of normalization processes by testing the influence of another form of attention: auditory stream segregation. Specifically, if listeners are presented with a fast and a slow talker at the same time but in different ears, does explicitly attending to one or the other stream influence target perception? The aforementioned model [4] predicts that selective attention should not influence target perception, since normalization processes should be robust against changes in attention allocation. Alternatively, if attention does modulate normalization processes, two participants, one attending to fast, the other to slow speech, should show different perception. 
    Dutch participants (Expt 1: N=32; Expt 2: N=16; Expt 3: N=16) were presented with 200 fast and slow context sentences of various lengths, followed by a target duration continuum ambiguous between, e.g., short target “geven” /ˈxevə/ give vs. long target “gegeven” /xəˈxevə/ given (i.e., 20 target pairs differing in the presence/absence of the unstressed syllable /xə-/). Critically, in Experiment 1, participants heard two talkers simultaneously (talker and location counter-balanced across participants), one (relatively long) sentence at a fast rate, and one (half as long) sentence at a slow rate (rate varied within participants). Context sentences were followed by ambiguous targets from yet another talker (Fig. 1). Half of the participants were instructed to attend to talker A, while the other half attended to talker B. Thus, participants heard identical auditory stimuli, but varied in which talker they attended to. Debriefing questionnaires and transcriptions of attended talkers in filler trials confirmed that participants successfully attended to one talker, and ignored the other. Nevertheless, no effect of attended rate was found (Fig. 2; p>.9), indicating that modulation of attention did not influence participants’ rate normalization. Control experiments showed that it was possible to obtain rate effects with single talker contexts that were either talker-incongruent (Expt 2) or talker-congruent (Expt 3) with the following target (Fig. 1). In both of these experiments, there was a higher proportion of long target responses following a fast context (Fig. 2). This shows that contextual rate affected the perception of syllabic duration and that talker-congruency with the target did not change the effect. Therefore, in line with [4], the current experiments suggest that normalization processes arise early in perception, and are robust against changes in attention.
  • Kaufeld, G., Naumann, W., Martin, A. E., & Bosker, H. R. (2018). Contextual speech rate influences morphosyntactic prediction and integration. Poster presented at LabPhon16 - Variation, development and impairment: Between phonetics and phonology, Lisbon, Portugal.
  • Kaufeld, G., Naumann, W., Ravenschlag, A., Martin, A. E., & Bosker, H. R. (2018). Contextual speech rate influences morphosyntactic prediction and integration. Talk presented at Architectures and Mechanisms for Language Processing (AMLaP 2018). Berlin, Germany. 2018-09-06 - 2018-09-08.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2018). Do effects of habitual speech rate normalization on perception extend to self? Talk presented at Psycholinguistics in Flanders (PiF 2018). Ghent, Belgium. 2018-06-04 - 2018-06-05.

    Abstract

    Listeners are known to use contextual speech rate in processing temporally ambiguous speech sounds. For instance, a fast adjacent speech context makes a vowel sound relatively long, whereas a slow context makes it sound relatively short (Reinisch & Sjerps, 2013). Besides the local contextual speech rate, listeners also track talker-specific habitual speech rates (Reinisch, 2016; Maslowski et al., in press). However, effects of one’s own speech rate on the perception of another talker’s speech are as yet unexplored. Such effects are potentially important, given that, in dialogue, a listener’s own speech often constitutes the context for the interlocutor’s speech. Three experiments tested the contribution of self-produced speech to perception of the habitual speech rate of another talker. In Experiment 1, one group of participants was instructed to speak fast (high-rate group), whereas another group had to speak slowly (low-rate group; 16 participants per group). The two groups were compared on their perception of ambiguous Dutch /A/-/a:/ vowels embedded in neutral rate speech from another talker. In Experiment 2, the same participants listened to playback of their own speech, whilst evaluating target vowels in neutral rate speech as before. Neither of these experiments provided support for the involvement of self-produced speech in perception of another talker's speech rate. Experiment 3 repeated Experiment 2 with a new participant sample, who did not know the participants from the previous two experiments. Here, a group effect was found on perception of the neutral rate talker. This result replicates the finding of Maslowski et al. that habitual speech rates are perceived relative to each other (i.e., neutral rate sounds fast in the presence of a slower talker and vice versa), with naturally produced speech. Taken together, the findings show that self-produced speech is processed differently from speech produced by others. They carry implications for our understanding of the perceptual and cognitive mechanisms involved in rate-dependent speech perception and the link between production and perception in dialogue settings.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2018). How speech rate normalization affects lexical access. Talk presented at Architectures and Mechanisms for Language Processing (AMLaP 2018). Berlin, Germany. 2018-09-06 - 2018-09-08.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2018). Self-produced speech rate is processed differently from other talkers' rates. Poster presented at the International Workshop on Language Production (IWLP 2018), Nijmegen, The Netherlands.

    Abstract

    Interlocutors perceive phonemic category boundaries relative to talkers’ produced speech rates. For instance, a temporally ambiguous vowel between Dutch short /A/ and long /a:/ sounds short (i.e., as /A/) in a slow speech context, but long in a fast context. Besides the local contextual speech rate, listeners also track talker-specific habitual speech rates (Maslowski et al., in press). However, it is yet unclear whether self-produced speech rate modulates perception of another talker’s habitual rate. Such effects are potentially important, given that, in dialogue, a listener’s own speech often constitutes the context for the interlocutor’s speech. Three experiments addressed this question. In Experiment 1, one group of participants was instructed to speak fast, whereas another group had to speak slowly (16 participants per group). The two groups were then compared on their perception of ambiguous Dutch /A/-/a:/ vowels embedded in neutral rate speech from another talker. In Experiment 2, the same participants listened to playback of their own speech, whilst evaluating target vowels in neutral rate speech as before. Neither of these experiments provided support for the involvement of self-produced speech in perception of another talker's speech rate. Experiment 3 repeated Experiment 2 with a new participant sample, who were unfamiliar with the participants from the previous two experiments. Here, a group effect was found on perception of the neutral rate talker. This result replicates the finding of Maslowski et al. that habitual speech rates are perceived relative to each other (i.e., neutral rate sounds fast in the presence of a slower talker and vice versa), with naturally produced speech. Taken together, the findings show that self-produced speech is processed differently from speech produced by others. They carry implications for our understanding of the link between production and perception in dialogue.
  • Rodd, J., Bosker, H. R., Ernestus, M., & Ten Bosch, L. (2018). A connectionist model of serial order applied to speaking rate control. Poster presented at Computational Linguistics in the Netherlands 28, Nijmegen, The Netherlands.
  • Rodd, J., Bosker, H. R., Ernestus, M., Ten Bosch, L., & Meyer, A. S. (2018). To speed up, turn up the gain: Acoustic evidence of a 'gain-strategy' for speech planning in accelerated and decelerated speech. Poster presented at LabPhon16 - Variation, development and impairment: Between phonetics and phonology, Lisbon, Portugal.
  • Rodd, J., Bosker, H. R., Meyer, A. S., Ernestus, M., & Ten Bosch, L. (2018). How to speed up and slow down: Speaking rate control to the level of the syllable. Talk presented at the New Observations in Speech and Hearing seminar series, Institute of Phonetics and Speech processing, LMU Munich. Munich, Germany.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2018). Running or speed-walking? Simulations of speech production at different rates. Poster presented at the International Workshop on Language Production (IWLP 2018), Nijmegen, The Netherlands.

    Abstract

    That speakers can vary their speaking rate is evident, but how they accomplish this has hardly been studied. The effortful experience of deviating from one's preferred speaking rate might result from shifting between different regimes (system configurations) of the speech planning system. This study investigates control over speech rate through simulations of a new connectionist computational model of the cognitive process of speech production, derived from Dell, Burger and Svec’s (1997) model to fit the temporal characteristics of observed speech. We draw an analogy from human movement: the selection of walking and running gaits to achieve different movement speeds. Are the regimes of the speech production system arranged into multiple ‘gaits’ that resemble walking and running? During training of the model, different parameter settings are identified for different speech rates, which can be equated with the regimes of the speech production system. The parameters can be considered to be dimensions of a high-dimensional ‘regime space’, in which different regimes occupy different parts of the space. In a single gait system, the regimes are qualitatively similar, but quantitatively different. They are arranged along a straight line through regime space. Different points along this axis correspond directly to different speaking rates. In a multiple gait system, the arrangement of the regimes is more dispersed, with no obvious relationship between the regions associated with each gait. After training, the model achieved good fits at all three speaking rates, and the parameter settings associated with each speaking rate were different. The broad arrangement of the parameter settings for the different speaking rates in regime space was non-axial, suggesting that ‘gaits’ may be present in the speech planning system.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2018). Run-speaking? Simulations of rate control in speech production. Poster presented at Architectures and Mechanisms for Language Processing (AMLaP 2018), Berlin, Germany.
  • Bosker, H. R. (2017). Comparing the evaluation and processing of native and non-native disfluencies. Talk presented at the DISFLUENCY 2017. Louvain-la-Neuve, Belgium. 2017-02-15 - 2017-02-17.
  • Bosker, H. R. (2017). Neural entrainment persists after stimulation, guiding temporal sampling of subsequent speech. Poster presented at the Neural Oscillations in Speech and Language Processing symposium, Berlin, Germany.
  • Bosker, H. R., & Kösem, A. (2017). An entrained rhythm’s frequency, not phase, influences temporal sampling of speech. Talk presented at Interspeech 2017. Stockholm, Sweden. 2017-08-20 - 2017-08-24.

    Abstract

    Brain oscillations have been shown to track the slow amplitude fluctuations in speech during comprehension. Moreover, there is evidence that these stimulus-induced cortical rhythms may persist even after the driving stimulus has ceased. However, how exactly this neural entrainment shapes speech perception remains debated. This behavioral study investigated whether and how the frequency and phase of an entrained rhythm would influence the temporal sampling of subsequent speech. In two behavioral experiments, participants were presented with slow and fast isochronous tone sequences, followed by Dutch target words ambiguous between as /ɑs/ “ash” (with a short vowel) and aas /a:s/ “bait” (with a long vowel). Target words were presented at various phases of the entrained rhythm. Both experiments revealed effects of the frequency of the tone sequence on target word perception: fast sequences biased listeners to more long /a:s/ responses. However, no evidence for phase effects could be discerned. These findings show that an entrained rhythm’s frequency, but not phase, influences the temporal sampling of subsequent speech. These outcomes are compatible with theories suggesting that sensory timing is evaluated relative to entrained frequency. Furthermore, they suggest that phase tracking of (syllabic) rhythms by theta oscillations plays a limited role in speech parsing.
  • Bosker, H. R., & Cooke, M. (2017). Comparing the rhythmic properties of plain and Lombard speech. Poster presented at the Abstraction, Diversity and Speech Dynamics Workshop, Herrsching, Germany.
  • Bosker, H. R. (2017). The role of temporal amplitude modulations in the political arena: Hillary Clinton vs. Donald Trump. Talk presented at Interspeech 2017. Stockholm, Sweden. 2017-08-20 - 2017-08-24.

    Abstract

    Speech is an acoustic signal with inherent amplitude modulations in the 1-9 Hz range. Recent models of speech perception propose that this rhythmic nature of speech is central to speech recognition. Moreover, rhythmic amplitude modulations have been shown to have beneficial effects on language processing and the subjective impression listeners have of the speaker. This study investigated the role of amplitude modulations in the political arena by comparing the speech produced by Hillary Clinton and Donald Trump in the three presidential debates of 2016. Inspection of the modulation spectra, revealing the spectral content of the two speakers’ amplitude envelopes after matching for overall intensity, showed considerably greater power in Clinton’s modulation spectra (compared to Trump’s) across the three debates, particularly in the 1-9 Hz range. The findings suggest that Clinton’s speech had a more pronounced temporal envelope with rhythmic amplitude modulations below 9 Hz, with a preference for modulations around 3 Hz. This may be taken as evidence for a more structured temporal organization of syllables in Clinton’s speech, potentially due to more frequent use of preplanned utterances. Outcomes are interpreted in light of the potential beneficial effects of a rhythmic temporal envelope on intelligibility and speaker perception.
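The modulation-spectrum analysis described in this abstract (the spectral content of the amplitude envelope, with power concentrated in the 1-9 Hz range) can be approximated in a few lines. This is a simplified sketch, not the study's actual pipeline; the toy test signal, the Hilbert-envelope method, and the 10 Hz envelope cutoff are all assumptions:

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def modulation_spectrum(signal, fs):
    """Rough modulation spectrum: power spectrum of the low-pass-filtered
    amplitude envelope. A simplified stand-in for the paper's analysis."""
    env = np.abs(hilbert(signal))            # amplitude envelope
    b, a = butter(4, 10 / (fs / 2))          # keep only slow (<10 Hz) fluctuations
    env = filtfilt(b, a, env - env.mean())   # zero-phase filtering, DC removed
    spec = np.abs(np.fft.rfft(env)) ** 2
    freqs = np.fft.rfftfreq(len(env), 1 / fs)
    return freqs, spec

# Toy input: a 500 Hz carrier amplitude-modulated at a syllable-like 3 Hz.
fs = 16000
t = np.arange(0, 2.0, 1 / fs)
toy = (1 + 0.8 * np.sin(2 * np.pi * 3 * t)) * np.sin(2 * np.pi * 500 * t)
freqs, spec = modulation_spectrum(toy, fs)
band = (freqs >= 1) & (freqs <= 9)
peak = freqs[band][np.argmax(spec[band])]
print(peak)  # modulation peak, expected near 3 Hz
```

On this toy signal the peak falls at the 3 Hz modulation rate, analogous to the ~3 Hz preference the abstract reports for Clinton's speech.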
  • Bosker, H. R. (2017). How your own speech rate can change how you listen to others. Poster presented at the Abstraction, Diversity and Speech Dynamics Workshop, Herrsching, Germany.
  • Bosker, H. R. (2017). Foreign languages sound fast: Evidence for the 'Gabbling Foreigner Illusion'. Talk presented at the Dutch Association for Phonetic Sciences. Amsterdam, The Netherlands.

    Abstract

    Anecdotal evidence suggests that unfamiliar languages sound faster than one’s native language. Empirical evidence for this impression has come from explicit tempo judgments. However, it is unknown whether such perceived rate differences between native and foreign languages (FLs) have effects on implicit speech processing. Our measure of implicit perception was ‘rate normalization’: Dutch and German listeners interpret vowels midway between /ɑ/ and /a:/ more often as /a:/ if the target vowel follows a fast (vs. slow) sentence. We asked whether such a ‘rate normalization’ effect may be observed when the context is not actually faster but simply spoken in a foreign language. Dutch and German participants listened to Dutch and German (rate-matched) fast and slow sentences, followed by non-words that contained vowels from an /a-a:/ duration continuum. Participants indicated which vowel they heard (fap vs. faap). Across three experiments, we consistently found that German listeners reported more /a:/ responses after foreign sentences (vs. native), suggesting that foreign sentences were indeed perceived as faster. However, mixed results were found for the Dutch groups. We conclude that the subjective impression that FLs sound fast may have an effect on implicit speech processing, influencing how language learners perceive spoken segments in a FL.
  • Bosker, H. R., & Cooke, M. (2017). Rhythm in plain and Lombard speech. Poster presented at the 9th Speech in Noise Workshop, Oldenburg, Germany.
  • Does, R., Van Bergen, G., & Bosker, H. R. (2017). Testing the effect of different disfluency distributions on hearer predictions. Poster presented at DETEC 2017; Discourse Expectations: Theoretical, Experimental and Computational Perspectives, Nijmegen, The Netherlands.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2017). When slow speech sounds fast: How the speech rate of one talker influences perception of another talker. Talk presented at the IPS workshop: Abstraction, Diversity, and Speech Dynamics. Herrsching am Ammersee, Germany. 2017-05-03 - 2017-05-05.

    Abstract

    Listeners are continuously exposed to a broad range of speech rates. Earlier work has shown that listeners perceive phonetic category boundaries relative to contextual speech rate. This process of rate-dependent speech perception has been suggested to occur across talker changes, with the speech rate of talker A influencing perception of talker B. This study tested whether a ‘global’ speech rate calculated over multiple talkers and over a longer period of time affected perception of the temporal Dutch vowel contrast /ɑ/-/a:/. First, Experiment 1 demonstrated that listeners more often reported hearing long /a:/ in fast contexts than in ‘neutral rate’ contexts, replicating earlier findings. Then, in Experiment 2, one participant group was exposed to ‘neutral’ speech from talker A intermixed with slow speech from talker B. Another group listened to the same ‘neutral’ speech from talker A, but to fast speech from talker B. Between-group comparison in the ‘neutral’ condition revealed that Group 1 reported more long /a:/ than Group 2, indicating that A’s ‘neutral’ speech sounded faster when B was slower. Finally, Experiment 3 tested whether talking at slow or fast rates oneself elicits the same ‘global’ rate effects. However, no evidence was found that self-produced speech modulated perception of talker A. This study corroborates the idea that ‘global’ rate-dependent effects occur across talkers, but are insensitive to one’s own speech rate. Results are interpreted in light of the general auditory mechanisms thought to underlie rate normalization, with implications for our understanding of dialogue.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2017). Whether long-term tracking of speech rate affects perception depends on who is talking. Poster presented at Interspeech 2017, Stockholm, Sweden.

    Abstract

    Speech rate is known to modulate perception of temporally ambiguous speech sounds. For instance, a vowel may be perceived as short when the immediate speech context is slow, but as long when the context is fast. Yet, effects of long-term tracking of speech rate are largely unexplored. Two experiments tested whether long-term tracking of rate influences perception of the temporal Dutch vowel contrast /ɑ/-/a:/. In Experiment 1, one low-rate group listened to 'neutral' rate speech from talker A and to slow speech from talker B. Another high-rate group was exposed to the same neutral speech from A, but to fast speech from B. Between-group comparison of the 'neutral' trials revealed that the low-rate group reported a higher proportion of /a:/ in A's 'neutral' speech, indicating that A sounded faster when B was slow. Experiment 2 tested whether one's own speech rate also contributes to effects of long-term tracking of rate. Here, talker B's speech was replaced by playback of participants' own fast or slow speech. No evidence was found that one's own voice affected perception of talker A in larger speech contexts. These results carry implications for our understanding of the mechanisms involved in rate-dependent speech perception and of dialogue.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2017). Whether long-term tracking of speech rate affects perception depends on who is talking. Poster presented at the Donders Poster Sessions, Nijmegen, The Netherlands.

    Abstract

    Speech rate is known to modulate perception of temporally ambiguous speech sounds. For instance, a vowel may be perceived as short when the immediate speech context is slow, but as long when the context is fast. Yet, effects of long-term tracking of speech rate are largely unexplored. Two experiments tested whether long-term tracking of rate influences perception of the temporal Dutch vowel contrast /A/-/a:/. In Experiment 1, one low-rate group listened to ‘neutral’ rate speech from talker A and to slow speech from talker B. Another high-rate group was exposed to the same neutral speech from A, but to fast speech from B. Between-group comparison of the ‘neutral’ trials revealed that the low-rate group reported a higher proportion of /a:/ in A’s ‘neutral’ speech, indicating that A sounded faster when B was slow. Experiment 2 tested whether one’s own speech rate also contributes to effects of long-term tracking of rate. Here, talker B’s speech was replaced by playback of participants’ own fast or slow speech. No evidence was found that one’s own voice affected perception of talker A in larger speech contexts. These results carry implications for our understanding of the mechanisms involved in rate-dependent speech perception and of dialogue.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2017). Simulating speaking rate control: A spreading activation model of syllable timing. Poster presented at the Workshop Conversational speech and lexical representations, Nijmegen, The Netherlands.

    Abstract

    Speech can be produced at different rates. The ability to produce faster or slower speech may be thought to result from executive control processes enlisted to modulate lexical selection and phonological encoding stages of speech planning. This study used simulations of the model of serial order in language by Dell, Burger and Svec (1997, DBS) to characterise the strategies adopted by speakers when naming pictures at fast, medium and slow prescribed rates. Our new implementation of DBS was able to produce activation patterns that correlated strongly with observed syllable-level timing of disyllabic words from this task. For each participant, different speaking rates were associated with different regions of the DBS parameter space. The precise placement of the speaking rates in the parameter space differed markedly between participants. Participants applied broadly the same parameter manipulation to accelerate their speech. This was however not the case for deceleration. Hierarchical clustering revealed two distinct patterns of parameter adjustment employed to decelerate speech, suggesting that deceleration is not necessarily achieved by the inverse process of acceleration. In addition, potential refinements to the DBS model are discussed.
  • Rodd, J., Bosker, H. R., Ernestus, M., Ten Bosch, L., & Meyer, A. S. (2017). How we regulate speech rate: Phonetic evidence for a 'gain strategy' in speech planning. Poster presented at the Abstraction, Diversity and Speech Dynamics Workshop, Herrsching, Germany.
  • Bosker, H. R. (2016). How our own voice influences speech perception. Poster presented at the 2nd Workshop on Psycholinguistic Approaches to Speech Recognition in Adverse Conditions (PASRAC), Nijmegen, The Netherlands.

    Abstract

    In natural communication, our own speech and that of others follow each other in rapid succession. As such, the immediate context of an utterance spoken by our conversational partner includes speech that we produced ourselves moments earlier. Given the close temporal proximity of our own speech to that of others, it is surprising to find that there are hardly any studies investigating whether and how the phonetic properties of our own speech may influence our perception of the speech of others. In contrast, effects of surrounding context are well known in the literature. For example, the perception of an ambiguous Dutch vowel midway between short /ɑ/ and long /a:/ may be shifted towards the perception of long /a:/ by presenting it in a context sentence with a fast speech rate. This temporal context effect, known as rate normalization, seems to be a general auditory process which generalizes across different sound sources. For instance, listening to a talker with a fast speech rate may influence our perception of another talker (Newman & Sawusch, 2009). This raises the question whether producing slow or fast speech rates ourselves may also influence our perception of others. This study investigated effects of our own speech rate on our perception of others through a set of experiments targeting rate normalization. In each experiment, fast and slow context sentences were followed by target words containing a vowel continuum from /ɑ/ to /a:/. Experiment 1 used a standard rate normalization design, with participants listening to fast and slow speech followed by ambiguous target words. The categorization patterns of target words, observed in Experiment 1, replicate previous studies showing that hearing a fast speech rate biases subsequent target perception towards /a:/. 
In Experiment 2, participants were instructed to produce the context sentences themselves at a specified fast or slow rate, after which the ambiguous target words were immediately presented auditorily. Participants’ categorization data show that the faster participants produced the context sentences, the more they reported to perceive the target vowel /a:/. That is, participants’ own speech rate influenced their perception of subsequent target words. This suggests that phonetic properties of our own voice can change our perception of others (through normalization for one’s own speech rate). Experiment 3 tested whether covert speech production (i.e., silent production in one’s mind) at different rates may also influence subsequent perception. However, this time no effect of the covertly produced fast and slow rates was observed. Together, Experiment 2 and Experiment 3 suggest a central role for self-monitoring of the external (i.e., overt) speech signal. Concluding, this study finds that variation in speech production may induce variation in speech perception, thus carrying implications for our understanding of spoken communication in common dialogue settings. Moreover, it may provide a novel rationale for phonetic convergence in conversation (when two interlocutors converge towards each other’s speech rate). That is, phonetic convergence may not only be beneficial for social integration but also help to avoid interfering effects of (self-produced) divergent speech rates.
  • Bosker, H. R., & Reinisch, E. (2016). Testing the ‘Gabbling Foreigner Illusion’: Do foreign languages sound fast? Poster presented at the 2nd Workshop on Psycholinguistic Approaches to Speech Recognition in Adverse Conditions (PASRAC), Nijmegen, The Netherlands.

    Abstract

    Anecdotal evidence suggests that unfamiliar languages sound faster than one’s native language. This impression has been termed the ‘Gabbling Foreigner Illusion’ (Cutler, 2012; p.338) and is supported by empirical research. For example, German and Japanese listeners consistently overestimate the other language’s speech rate by about 7-9% (Pfitzinger & Tamashima, 2006). Instead of using explicit rate judgments, the present study set out to test whether the reported illusory rate difference between native and foreign languages would have effects on implicit speech processing. Specifically, we used the effect of normalization for speaking rate as a measure of implicit rate perception. To illustrate, Dutch listeners interpret a vowel midway between /ɑ/ (short duration) and /a:/ (long duration) more often as /a:/ if the target word follows a fast (rather than a slow) sentence (Reinisch & Sjerps, 2013). That is, vowel length is perceived contrastively with the rate of the context. The crucial question of our study is whether such an effect may be observed when the context is not actually faster but simply spoken in a foreign language. Dutch and German versions of 30 sentence contexts were recorded by a Dutch-German bilingual. Sentence pairs were semantically similar across languages and matched in number of syllables. Each sentence was linearly compressed or expanded to a fast and slow version with sentence durations matched across languages. Target ‘words’ contained vowels from a duration continuum from /ɑ/ to /a:/ and were nonwords in both languages. Pretests ensured that the vowel continuum was perceived identically by speakers of Dutch and German. In Experiment 1, Dutch and German listeners were presented with all (fast, slow, Dutch, German) sentences followed by the ambiguous targets. Listeners were asked to decide which nonword they heard (e.g., fap vs. faap).
The compressed sentences (fast) were expected to trigger more long-vowel responses relative to the expanded (slow) sentences. Similarly, if the ‘Gabbling Foreigner Illusion’ affects speech processing, then listening to one’s foreign language (German for Dutch listeners, and Dutch for Germans) should induce a perceptually faster rate, also leading to more long-vowel responses. Results showed a consistent effect of rate normalization with more ‘long’ responses following the compressed sentences. Moreover, for German listeners, a language effect was found. Foreign (Dutch) sentences triggered more ‘long’ responses than native (German) sentences, suggesting that foreign sentences were indeed perceived as faster than native sentences. However, the opposite was found for the Dutch listeners. For them, their native language (Dutch) sounded faster rather than their foreign language (German). Experiment 2 controlled for additional acoustic properties of the context sentences across the two languages. Even though this manipulation did reduce the language effect in the Dutch group significantly, the overall results were similar: both groups perceived Dutch as faster. Taken together, we conclude that the subjective perception of speaking rate, as suggested by the ‘Gabbling Foreigner Illusion’, may have an effect on speech processing, as shown by the German group. Potential explanations for variation between the two listener groups may be related to varying language proficiency.
  • Bosker, H. R. (2016). Huh? Ik versta je niet.. [Huh? I don’t understand you..]. Talk presented at the Science of Tomorrow lectures. The Hague, The Netherlands.
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2016). Listening under cognitive load makes speech sound fast. Talk presented at the Speech Processing in Realistic Environments Workshop. Groningen, The Netherlands. 2016-01-09.
  • Bosker, H. R. (2016). Fast and slow listening: how speech rate shapes perception [Invited talk]. Talk presented at the Institute of Phonetics and Speech Processing. Munich, Germany. 2016.

    Abstract

    Words rarely occur in isolation. Rather, they are produced in rich acoustic contexts including the preceding sentence, speech from other talkers, our own speech, background noise, etc. The temporal properties of the acoustic context (e.g., speech rate) have long been known to influence the perception of subsequent words. For instance, the perception of a Dutch vowel ambiguous between short /ɑ/ and long /a:/ may be biased towards long /a:/ if the vowel is preceded by a precursor with a fast speech rate. Many studies in the literature have investigated this process known as rate normalization, showing that rate normalization is a general auditory phenomenon that occurs early in speech perception. However, few studies have come up with an explanatory mechanism that specifies how rate normalization takes place. In this talk, I will present several studies that support the view of rate normalization as an early general auditory process. Furthermore, I will propose a neural mechanism behind rate normalization, involving entrainment of endogenous neural oscillations to the rhythm of the speech signal. Behavioral and neuroimaging (MEG) experiments will be presented in support of this proposal.
  • Bosker, H. R. (2016). Neural entrainment as a mechanism behind rate normalization in speech perception. Poster presented at the Nijmegen Lectures 2016, Nijmegen, The Netherlands.

    Abstract

    Speech can be delivered at different rates and, as a consequence, listeners have to normalize the incoming speech signal for the rate at which it was produced. This perceptual process, known as rate normalization, is contrastive in nature: for instance, the perception of an ambiguous Dutch vowel in between short /ɑ/ and long /a:/ is biased towards hearing long /a:/ when preceded by a fast sentence context. Previously, rate normalization has (primarily) been explained in terms of durational contrast: the ambiguous vowel is perceived as longer following a fast context because the ambiguous vowel has a relatively long duration compared to the preceding shorter vowels in the fast context. However, durational contrast cannot easily account for findings of rate normalization induced by non-adjacent speech rate, or rate normalization triggered by speech rate calculated over longer periods of time. Therefore, neural entrainment of endogenous theta oscillations to the syllabic rate of the speech signal is considered as a novel mechanism behind rate normalization. Instead of contrasting the target sound to the duration of preceding sounds, it is hypothesized that listeners contrast the target sound to the entrained neural rhythm. In order to compare the two accounts of rate normalization (durational contrast vs. neural entrainment), a behavioral experiment was designed in which participants heard Dutch target words ambiguous between /ɑs/ “ash” and /a:s/ “bait”. These target words were preceded by four types of tone precursors, consisting of tone sequences with either short or long tones (71 vs. 125 ms), and presented at a slow or fast tonal rate (4 vs. 7 Hz). Categorization data show that the precursors’ tonal rate, not tonal duration, influenced listeners’ perception toward one word or the other. Thus, this finding challenges durational contrast, and supports neural entrainment, as the mechanism responsible for rate normalization.
  • Bosker, H. R. (2016). Our own speech rate influences speech perception. Poster presented at the Language in Interaction Summerschool on Human Language: From Genes and Brains to Behavior, Berg en Dal, The Netherlands.
  • Bosker, H. R. (2016). Our own speech rate influences speech perception. Poster presented at Speech Prosody 2016, Boston, MA, USA.

    Abstract

    During conversation, spoken utterances occur in rich acoustic contexts, including speech produced by our interlocutor(s) and speech we produced ourselves. Prosodic characteristics of the acoustic context have been known to influence speech perception in a contrastive fashion: for instance, a vowel presented in a fast context is perceived to have a longer duration than the same vowel in a slow context. Given the ubiquity of the sound of our own voice, it may be that our own speech rate - a common source of acoustic context - also influences our perception of the speech of others. Two experiments were designed to test this hypothesis. Experiment 1 replicated earlier contextual rate effects by showing that hearing pre-recorded fast or slow context sentences alters the perception of ambiguous Dutch target words. Experiment 2 then extended this finding by showing that talking at a fast or slow rate prior to the presentation of the target words also altered the perception of those words. These results suggest that between-talker variation in speech rate production may induce between-talker variation in speech perception, thus potentially explaining why interlocutors tend to converge on speech rate in dialogue settings.
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2016). Time flies when you're having fun: Cognitive load makes speech sound fast. Talk presented at the 2nd Workshop on Psycholinguistic Approaches to Speech Recognition in Adverse Conditions (PASRAC). Nijmegen, The Netherlands. 2016-10-31 - 2016-11-01.

    Abstract

    Speech perception in spontaneous conversation typically involves the execution of several concurrent tasks, such as driving a car or searching a menu. This simultaneous attentional and mnemonic processing taxes the cognitive system since it recruits limited central processing resources. How this cognitive load influences speech perception is debated. One account states that cognitive load has detrimental effects on speech perception by disrupting the sublexical (phonetic) encoding of the speech signal. This leads to an ‘impoverished encoding’ (Mattys & Wiget, 2011) of the phonetic cues in the signal, possibly induced by impaired perceptual acuity at the auditory periphery. Another account suggests that cognitive load affects the temporal computation of sensory input. People reliably underestimate durations of sensory input received under cognitive load, including speech (‘shrinking of time’; Block, Hancock, & Zakay, 2010), making spoken segments sound shorter (Casini, Burle, & Nguyen, 2009). This study tested the two accounts of the effects of cognitive load on speech perception (‘impoverished encoding’ and ‘shrinking of time’) by investigating acoustic context effects. The temporal and spectral context in which a particular word occurs influences that word’s perception. For instance, the perception of an ambiguous Dutch vowel midway between /ɑ/ (short duration, low F2) and /a:/ (long duration, high F2) may be biased towards /a:/ by presenting it in a fast context (rate normalization) or a context with a relatively low F2 (spectral normalization; Reinisch & Sjerps, 2013). The ‘impoverished encoding’ account hypothesizes that, when context sentences are presented under cognitive load, the phonetic encoding of the context sentence would be disrupted. As such, the temporal and spectral characteristics of that context sentence should have a reduced influence on the perception of a subsequent target word (cognitive load modulating context effects). 
Alternatively, the ‘shrinking of time’ account holds that cognitive load leads to an underestimation of the duration of the context sentence, inducing a perceptually faster speech rate. This account would therefore not predict a modulation of context effects under cognitive load but rather an independent effect of this perceived increase in speech rate of the context sentence on target perception (higher proportion of /a:/ responses). In two experiments, participants were presented with context sentences followed by target words containing vowels ambiguous between Dutch /ɑ/ and /a:/. In Experiment 1, the context varied in speech rate (slow or fast); in Experiment 2, the context varied in average F2 (high or low). Crucially, during the presentation of the context sentence (not during target presentation), a concurrent easy or difficult visual search task was administered (low vs. high cognitive load). We found reliable acoustic context effects: contexts with a higher speech rate (Experiment 1) or a lower average F2 (Experiment 2) biased target perception towards /a:/. Moreover, cognitive load did not modulate these temporal or spectral context effects. Rather, a consistent main effect of cognitive load was found: higher cognitive load biased perception towards /a:/. This suggests a perceptual increase in the context’s speech rate under increased cognitive load, providing support for the ‘shrinking of time’ account.
  • Kösem, A., Bosker, H. R., Meyer, A. S., Jensen, O., & Hagoort, P. (2016). Neural entrainment reflects temporal predictions guiding speech comprehension. Poster presented at the Eighth Annual Meeting of the Society for the Neurobiology of Language (SNL 2016), London, UK.

    Abstract

    Speech segmentation requires flexible mechanisms to remain robust to features such as speech rate and pronunciation. Recent hypotheses suggest that low-frequency neural oscillations entrain to ongoing syllabic and phrasal rates, and that neural entrainment provides a speech-rate invariant means to discretize linguistic tokens from the acoustic signal. How this mechanism functionally operates remains unclear. Here, we test the hypothesis that neural entrainment reflects temporal predictive mechanisms. It implies that neural entrainment is built on the dynamics of past speech information: the brain would internalize the rhythm of preceding speech to parse the ongoing acoustic signal at optimal time points. A direct prediction is that ongoing neural oscillatory activity should match the rate of preceding speech even if the stimulation changes, for instance when the speech rate suddenly increases or decreases. Crucially, the persistence of neural entrainment to past speech rate should modulate speech perception. We performed an MEG experiment in which native Dutch speakers listened to sentences with varying speech rates. The beginning of the sentence (carrier window) was either presented at a fast or a slow speech rate, while the last three words (target window) were presented at an intermediate rate across trials. Participants had to report the perception of the last word of the sentence, which was ambiguous with regard to its vowel duration (short vowel /ɑ/ – long vowel /aː/ contrast). MEG data was analyzed in source space using beamformer methods. Consistent with previous behavioral reports, the perception of the ambiguous target word was influenced by the past speech rate; participants reported more /aː/ percepts after a fast speech rate, and more /ɑ/ after a slow speech rate. During the carrier window, neural oscillations efficiently tracked the dynamics of the speech envelope.
During the target window, we observed oscillatory activity that corresponded in frequency to the preceding speech rate. Traces of neural entrainment to the past speech rate were reliably observed in medial prefrontal areas. Right superior temporal cortex also showed persisting oscillatory activity which correlated with the observed perceptual biases: participants whose perception was more influenced by the manipulation in speech rate also showed stronger remaining neural oscillatory patterns. The results show that neural entrainment lasts after rhythmic stimulation. The findings further provide empirical support for oscillatory models of speech processing, suggesting that neural oscillations actively encode temporal predictions for speech comprehension.
  • Kösem, A., Bosker, H. R., Meyer, A. S., Jensen, O., & Hagoort, P. (2016). Neural entrainment to speech rhythms reflects temporal predictions and influences word comprehension. Poster presented at the 20th International Conference on Biomagnetism (BioMag 2016), Seoul, South Korea.
  • Maslowski, M., Bosker, H. R., & Meyer, A. S. (2016). Slow speech can sound fast: How the speech rate of one talker has a contrastive effect on the perception of another talker. Poster presented at Architectures and Mechanisms for Language Processing (AMLaP 2016), Bilbao, Spain.

    Abstract

    Listeners are continuously exposed to a broad range of speech rates. Earlier work has shown that listeners perceive phonetic category boundaries relative to contextual speech rate. It has been suggested that this process of speech rate normalization occurs across talker changes. This would predict that the speech rate of talker A influences perception of the rate of another talker B. We assessed this hypothesis by testing effects of speech rate on the perception of the Dutch vowel continuum /ɑ/-/a:/. One participant group was exposed to 'neutral' speech from talker A intermixed with fast speech from talker B. Another group listened to the same speech from talker A, but to slow speech from talker B. We observed a difference in perception of talker A depending on the speech rate of talker B: A's 'neutral' speech was perceived as slow when B spoke faster. These findings corroborate the idea that speech rate normalization occurs across talkers, but they challenge the assumption that listeners average over speech rates from multiple talkers. Instead, they suggest that listeners contrast talker-specific rates.
  • Maslowski, M., Bosker, H. R., & Meyer, A. S. (2016). Slow speech can sound fast: How the speech rate of one talker affects perception of another talker. Talk presented at the Donders Discussions 2016. Nijmegen, The Netherlands. 2016-11-24 - 2016-11-25.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2016). Slow speech can sound fast: How the speech rate of one talker has a contrastive effect on the perception of another talker. Talk presented at MPI Proudly Presents. Nijmegen, The Netherlands. 2016-06-01.
  • Reinisch, E., & Bosker, H. R. (2016). Does foreign language speech sound faster than one’s native language? Talk presented at the 2nd workshop on Second Language Prosody (SLaP). Graz, Austria. 2016-11-18 - 2016-11-19.
  • Bosker, H. R. (2015). An integrative account of fluency perception. Talk presented at the 8th Anela Applied Linguistics Conference. Egmond aan Zee. 2015-05-22.
  • Bosker, H. R., Tjiong, V., Quené, H., Sanders, T., & de Jong, N. H. (2015). Both native and non-native disfluencies trigger listeners’ attention. Poster presented at the 7th Workshop on Disfluency in Spontaneous Speech (DiSS), Edinburgh.
  • Bosker, H. R., & Reinisch, E. (2015). Nonnative speech sounds fast: Evidence from speechrate normalization. Poster presented at Architectures and Mechanisms for Language Processing (AMLaP 2015), Malta.
  • Bosker, H. R., & Reinisch, E. (2015). Normalization for speechrate in native and nonnative speech. Talk presented at the 18th International Congress of Phonetic Sciences 2015 [ICPhS XVIII]. Glasgow. 2015-08-10.
  • Bosker, H. R. (2015). How speech rate shapes perception. Talk presented at the Dutch Association for Phonetic Sciences. Utrecht.
  • Bosker, H. R. (2015). The processing and evaluation of fluency in native and non-native speech. Talk presented at the Grote Taaldag. Utrecht. 2015-02-07.
  • Bosker, H. R. (2014). Diversity in how listeners cope with variation in speech. Talk presented at the Workshop 'Combining Different Approaches to Linguistic Diversity'. MPI, Nijmegen. 2014-10-31.
  • Bosker, H. R. (2014). Click on thee uh.. carburetor. Processing disfluencies and speaker identity. Talk presented at the Workshop on fluency in native and non-native speech. Utrecht University. 2014-05-22.

    Abstract

    In my doctoral dissertation, I demonstrate that, upon encountering disfluencies such as uh and uhm, listeners may predict that the speaker will refer to a relatively more complex object. This suggests that listeners draw inferences about what caused the speaker’s production difficulty. The effect of speaker identity on these inferences remains debated: do listeners draw inferences based on what the listener himself would find difficult to refer to (egocentric account), or what the listener assumes the particular speaker at hand would find difficult (perspective-taking account)? In this talk, I will examine the studies that have investigated how knowledge about the speaker’s identity affects the processing of disfluency, describe some of my own work in this area, and propose a novel way of discriminating between the two accounts.
  • Bosker, H. R., Tjiong, J., Quené, H., Sanders, T., & De Jong, N. H. (2014). Both native and non-native disfluencies trigger listeners' attention. Poster presented at the 20th Architectures and Mechanisms for Language Processing Conference (AMLAP 2014), Edinburgh, Scotland.

    Abstract

    Disfluencies (such as uh and uhm) are a common phenomenon in spontaneous speech. Rather than filtering these hesitations from the incoming speech signal, listeners are sensitive to disfluency and have been shown to actually use disfluencies for speech comprehension. For instance, disfluencies have been found to have beneficial effects on listeners’ memory. Accumulating evidence indicates that attentional mechanisms underlie this disfluency effect: upon encountering disfluency, listeners raise their attention to the incoming speech signal. The experiments reported here investigated whether these beneficial effects of disfluency also hold when listening to a non-native speaker. Recent studies on the perception of non-native disfluency suggest that disfluency effects on prediction are attenuated when listening to a non-native speaker. This attenuation may be a result of listeners being familiar with the more frequent and more variable incidence of disfluencies in non-native speech. If listeners also modulate the beneficial effect of disfluency on memory when listening to a non-native speaker, it would indicate a certain amount of control on the part of the listener over how disfluencies affect attention, and thus comprehension. Furthermore, it would argue against the hypothesis that disfluencies affect comprehension in a rather automatic fashion (cf. the Temporal Delay Hypothesis). Using the Change Detection Paradigm, we presented participants with three-sentence passages that sometimes contained a filled pause (e.g., “... that the patient with the uh wound was...”). After each passage, participants saw a transcript of the spoken passage in which one word had been substituted (e.g., “wound” > “injury”). In our first experiment, participants were more accurate in recalling words from previously heard speech (i.e., detecting the change) if these words had been preceded by a disfluency (relative to a fluent passage).
Our second experiment - using non-native speech materials - demonstrated that non-native uh’s elicited an effect of the same magnitude and in the same direction: when new participants listened to a non-native speaker producing the same passages, they were also more accurate on disfluent (as compared to fluent) trials. These data suggest that, upon encountering a disfluency, listeners raise their attention levels irrespective of the (non-)native identity of the speaker. Whereas listeners have been found to modulate prediction effects of disfluencies when listening to non-native speech, no such modulation was found for memory effects of disfluencies in the present data, thus potentially constraining the role of listener control in disfluency processing. The current study emphasizes the central role of attention in an account of disfluency processing.
  • Bosker, H. R., & Quené, H. (2013). How do listeners cope with uhm.. disfluencies in native and non-native speech? Talk presented at Praat Group. Tilburg, The Netherlands.

    Abstract

    During the next Praat-groep meeting, I will contribute to the notion that it's not just about what you say, but also about how you say it. The focus here is on one particular aspect of speech performance, namely the (dis)fluency of speech. I will present results from two eye-tracking experiments targeting the effect that disfluencies may have on the predictive mechanisms of the listener. I will demonstrate that disfluencies may lead listeners to predict a more complex referent (relative to fluent speech). However, when participants listened to a non-native speaker, no effect of disfluency could be established. These two experiments show that (i) listeners are sensitive to the fluency of speech, and (ii) listeners are flexible in modulating the use of disfluencies based on speaker knowledge.
  • Bosker, H. R., Quené, H., Sanders, T., & De Jong, N. H. (2013). Perceiving the fluency of native and non-native speakers. Talk presented at the 23rd conference of the European Second Language Association (Eurosla 23). Amsterdam, The Netherlands. 2013-08-28 - 2013-08-31.

    Abstract

    Fluency assessment is part of many official language tests (e.g., TOEFL iBT) which evaluate non-native speakers’ language proficiency. Operationalizing the notion of fluency involves disentangling the different factors that influence fluency judgments. One approach to this issue has been the correlational analysis of acoustic measures and subjective judgments: which disfluencies (pauses, fillers, corrections) play a large role in fluency assessment and which do not? This approach has mainly been concerned with non-native speech, since native speakers are commonly considered to be ‘fluent’ in their mother tongue. One example of this approach is a previous study by the authors in which they observed strong correlations between fluency ratings and the pause and speed characteristics of the L2 speech. Disfluencies are, however, not limited to non-native speech: they also occur in native speech. It is as yet unclear whether there are any differences between the disfluencies of natives vs. non-natives, for instance in their relative contribution to fluency ratings. Therefore, the focus of this presentation lies on the perception of fluency in non-native and native speech. Crucially, instead of adopting a correlational approach, phonetic manipulations were applied to native and non-native speech such that causal relationships could be established between speech characteristics and fluency judgments. In two experiments fluency judgments on native and non-native Dutch speech were collected. The stimuli consisted of phonetically manipulated speech: in Experiment 1 the number and duration of silent pauses were manipulated, in Experiment 2 the speed of the speech was altered. The manipulated speech samples were presented to native listeners, who rated them on fluency using a 9-point Likert scale. 
Linear Mixed Models revealed that (i) natives were rated higher than non-natives, (ii) increasing the number or the duration of silent pauses (Experiment 1) or slowing down the speech (Experiment 2) led to lower fluency judgments, and crucially, (iii) there was no difference in the effects of the manipulations across native and non-native speech. These results suggest that the contributions of pause and speed characteristics to fluency judgments are similar across native and non-native fluency perception. Therefore, human raters judge the fluency characteristics of native and non-native speakers according to the same principles. The next step in our project is to study the online processing of disfluencies. Results from eye-tracking experiments using the Visual World Paradigm will be introduced that investigate the processing of native and non-native disfluencies.
  • Bosker, H. R., Quené, H., Sanders, T., & De Jong, N. H. (2013). Perceiving the fluency of native and non-native speakers. Talk presented at New Sounds. Montreal, Canada.

    Abstract

    Non-native speech is commonly riddled with disfluencies: pauses, uhm’s, corrections, etc. The presence of disfluencies in L2 speech has been demonstrated to strongly affect perceived fluency ratings as given by human raters. But native speakers also halt, uhm, correct or repeat themselves. The current study investigates the difference between the way fluency is assessed in native and non-native speech. Since natives produce fewer disfluencies in their speech than non-natives, they may receive higher fluency ratings. Another possibility might be that natives are not only rated higher, but that their disfluencies are also weighed differently by fluency raters than non-native disfluencies. One possible reason for this may lie in differential psycholinguistic functions of native vs. non-native disfluencies. In two experiments, raters were asked to judge the fluency of native and non-native Dutch speech that had been phonetically manipulated: in Experiment 1 the number and duration of silent pauses were manipulated, in Experiment 2 the speed of the speech was altered. The manipulated speech samples were presented to native listeners, who rated them on fluency using a 9-point Likert scale. Linear Mixed Models revealed that in both experiments (i) natives were rated higher than non-natives, (ii) increasing the number or the duration of silent pauses (Experiment 1) or slowing down the speech (Experiment 2) led to lower fluency judgments, and (iii) crucially, there was no difference in the effects of the manipulations across native and non-native speech. These results suggest that human raters judge the fluency characteristics of native and non-native speakers according to the same principles. The next step in our project is to study L1 and L2 fluency from the perspective of speech processing. Results from eye-tracking experiments focusing on the online processing of native and non-native disfluencies will be introduced.
  • Bosker, H. R. (2013). Perceiving the fluency of native and non-native speakers. Talk presented at the Language and Cognition Group, Leiden Institute for Brain and Cognition (LIBC). Leiden, The Netherlands.

    Abstract

    In this talk I would like to present results from (on-going) eye-tracking experiments, using the Visual World Paradigm, investigating the processing of disfluencies. Since spontaneous speech is strewn with disfluencies (such as uhm’s, silent pauses, repetitions, repairs, etc.), one may ask how listeners cope with these disfluencies. Do disfluencies hinder the processing of the content of the speech signal or can they actually be helpful to listeners in predicting what the speaker will say next? I will demonstrate that listeners use disfluencies in reference resolution: upon hearing the uh in a sentence like “Click on uh the sewing machine”, our participants showed more anticipatory looks to low-frequency pictures (e.g., the sewing machine) as compared to high-frequency pictures (e.g., the hand). In our view, listeners took the uh as a sign that the speaker was having trouble naming an object. These troubles are more likely to occur in naming low-frequency pictures than in naming high-frequency pictures, leading to more anticipatory looks to the low-frequency picture. Our data reveal that listeners are sensitive to disfluencies and that they make use of disfluencies, when listening to a native speaker, to anticipate subsequent content. The next step in our research is to investigate the processing of disfluencies in non-native speech. To illustrate this, I will present some recent data from a study into the supposedly beneficial effects of native and non-native disfluencies on subsequent memory.
  • Bosker, H. R., Quené, H., Sanders, T., & De Jong, N. H. (2013). Perceiving the fluency of native and non-native speakers. Poster presented at the 11th International Symposium for Psycholinguistics (ISP2013), Tenerife, Spain.

    Abstract

    Non-native speech is commonly riddled with disfluencies: pauses, uhm’s, corrections, etc. The presence of disfluencies in L2 speech has been demonstrated to strongly affect perceived fluency ratings as given by human raters. But native speakers also halt, uhm, correct or repeat themselves. The current study investigates the difference between the way fluency is assessed in native and non-native speech. Since natives produce fewer disfluencies in their speech than non-natives, they may receive higher fluency ratings. Another possibility might be that natives are not only rated higher, but that their disfluencies are also weighed differently by fluency raters than non-native disfluencies. One possible reason for this may lie in differential psycholinguistic functions of native vs. non-native disfluencies. In two experiments, raters were asked to judge the fluency of native and non-native Dutch speech that had been phonetically manipulated: in Experiment 1 the number and duration of silent pauses were manipulated, in Experiment 2 the speed of the speech was altered. The manipulated speech samples were presented to native listeners, who rated them on fluency using a 9-point Likert scale. Linear Mixed Models revealed that in both experiments (i) natives were rated higher than non-natives, (ii) increasing the number or the duration of silent pauses (Experiment 1) or slowing down the speech (Experiment 2) led to lower fluency judgments, and (iii) crucially, there was no difference in the effects of the manipulations across native and non-native speech. These results suggest that human raters judge the fluency characteristics of native and non-native speakers according to the same principles. The next step in our project is to study L1 and L2 fluency from the perspective of speech processing. Results from eye-tracking experiments focusing on the online processing of native and non-native disfluencies will be introduced.
  • Bosker, H. R., Quené, H., Sanders, T., & De Jong, N. H. (2013). The processing of disfluencies in native and non-native speech. Talk presented at The 19th Annual Conference on Architectures and Mechanisms for Language Processing (AMLaP 2013). Marseille, France. 2013-09-02 - 2013-09-04.

    Abstract

    Speech comprehension involves extensive use of prediction – “determining what you yourself or your interlocutor is likely to say next” (Pickering & Garrod, in press, p.14). Predictions may be based on the semantics, syntax or phonology of the incoming speech signal. Arnold, Hudson Kam, & Tanenhaus (2007) have convincingly demonstrated that listeners may even base their predictions on the presence of disfluencies. When participants in an eye-tracking experiment heard a disfluent instruction containing a filled pause, they were more likely to fixate an unknown than a known object – a disfluency bias. This suggests that listeners very rapidly draw inferences about the speaker and the possible sources of disfluency. Our current goal is to study the contrast between native and non-native disfluencies in speech comprehension. Non-native speakers have additional reason to be disfluent since they are speaking in their L2. If listeners are aware that non-native disfluencies may have different cognitive origins (for instance low L2 proficiency), the disfluency bias – present in native speech comprehension – may be attenuated when listening to non-native speech. Two eye-tracking studies, using the Visual World Paradigm, were designed to study the processing of native and non-native disfluencies. We presented participants with pictures of either high-frequency (e.g., a hand) or low-frequency objects (e.g., a sewing machine). Pre-recorded instructions from a native or a non-native speaker told participants to click on one of two pictures while participants’ eye movements were recorded. Instructions were either fluent (e.g., “Click on the [target]”) or disfluent (e.g., “Click on ..uh.. the [target]”). When listeners heard disfluent instructions from a native speaker, anticipatory eye movements towards low-frequency pictures were observed – a disfluency bias. In contrast, when listeners heard a non-native speaker produce the same utterances with a foreign accent, the disfluency bias was attenuated. We conclude that (a) listeners may use disfluencies to draw inferences about speaker difficulty in the conceptualization and formulation of the target; and (b) speaker knowledge (hearing a foreign accent) may modulate these inferences, presumably because of the strong correlation between non-native accent and disfluency.
  • Bosker, H. R. (2012). Perceiving the fluency of native and non-native speakers. Talk presented at the Workshop Fluent speech: Combining Cognitive and Educational Approaches. Utrecht, The Netherlands. 2012-11-12 - 2012-11-13.

    Abstract

    Within the language testing practice, the fluency level of test-takers is commonly assessed by human raters. The process through which raters come to their conclusions has long been a subject of research. One approach to this issue has been the correlational analysis of acoustic measures and subjective judgments: which disfluencies (pauses, fillers, corrections) play a large role in fluency assessment and which do not? This approach has mainly been concerned with non-native speech, since native speakers are commonly considered to be ‘fluent’ in their mother tongue. But disfluencies also occur in native speech. Therefore, the focus of this presentation lies on the perception of fluency in non-native and native speech. The first set of experiments involved the analysis of non-native fluency perception. Acoustic measurements of non-native speech were compared against subjective fluency ratings. It was found that fluency raters largely depend on the acoustics of the speech signal, mostly on pauses and speed. Subsequently, the perceptual salience of pauses and speed of speech was evaluated. It was hypothesized that pauses and speed of speech may be easy to perceive in the speech signal and therefore play a large role in fluency ratings. The results showed that perceptual saliency alone could not account for why fluency raters largely depend on pause and speed characteristics. The second set of experiments compared native and non-native fluency using phonetic manipulations. Altering the pause or speed characteristics of the speech signal had strong effects on fluency ratings, but these effects did not differ across native and non-native speech. The observations from these empirical studies lead us to conclude that fluency ratings are largely dependent on the acoustics of the speech signal, that pause and speed characteristics are the main contributors to fluency judgments and that these contributions are similar across native and non-native fluency perception.
  • Bosker, H. R., Quené, H., & De Jong, N. H. (2012). Native and non-native fluency: a fundamental or gradient difference? Talk presented at the 33rd TABU Dag. Groningen, The Netherlands. 2012-06-18 - 2012-06-19.

    Abstract

    In everyday life conversations are riddled with disfluencies: pauses, uhm’s, slow tempo, corrections, repetitions, etc. When assessing the fluency level of a non-native speaker, it has been shown that these acoustic features play a large role. Particularly the pause and speed characteristics of speech contribute much to fluency ratings. But native speakers also display these symptoms of spontaneous speech and as yet the relationship between native and non-native fluency remains unclear. Native fluency might fundamentally differ from non-native fluency, or it may be a gradient distinction. The current study directly compares the concepts of native and non-native fluency by means of phonetic manipulations. In two experiments, the number and duration of silent pauses (Experiment 1) and the speed of the speech (Experiment 2) were digitally manipulated. Fluency ratings by native listeners on these manipulated speech fragments revealed that increasing the number or the duration of silent pauses both led to a decrease in fluency judgments. Despite the clear gradient difference in fluency level of native versus non-native speakers, no evidence could be found for a difference in the effects of the pause manipulations across native and non-native speech. Results from Experiment 2 will demonstrate whether the same holds for speed manipulations in native and non-native speech. The results from Experiment 1 at least suggest that the notion of fluency is constant across native and non-native speech.
  • Bosker, H. R. (2012). The effect of …silent pauses… on native and non-native fluency perception. Talk presented at Experimental Linguistics Talks Utrecht (ELiTU). Utrecht, The Netherlands. 2012-04-02.

    Abstract

    Fluency assessment is part of many official language tests (e.g., TOEFL iBT) which evaluate non-native speakers’ language proficiency. Operationalizing and validating the notion of fluency involves disentangling the different factors that influence fluency judgments. In a previous study the authors found a primary role in L2 fluency perception for pause characteristics of speech. Therefore, the present experiment was designed to zoom in on the contribution of silent pauses to fluency perception. Native and non-native speech fragments from turns in simulated discussions were recorded. The number and duration of silent pauses in these fragments were manipulated. The manipulations resulted in three conditions: NoPauses (pauses >250ms excised); ShortPauses (pauses >250ms received an altered duration between 250-500ms); LongPauses (pauses >250ms received an altered duration between 750-1000ms). These manipulated native and non-native speech fragments were rated on oral fluency by untrained raters using a Latin Square design. Preliminary results (using Linear Mixed Models) demonstrated that non-native speech was rated as significantly less fluent than the native speech. In both native and non-native speech, the NoPauses condition was rated significantly more fluent than the other conditions. Also, a significant difference was established between ShortPauses and LongPauses. Despite the clear difference in fluency level of native vs. non-native speakers, it is concluded that both the number and duration of silent pauses have an equally strong effect on fluency perception in native and non-native speech. These results suggest that, at least with respect to pauses, the notion of fluency is constant across native and non-native speech.
  • Bosker, H. R., Quené, H., Sanders, T., & De Jong, N. H. (2012). The effect of …silent pauses… on native and non-native fluency perception. Talk presented at the 9th Annual EALTA Conference: Validity in Language Testing and Assessment. Innsbruck, Austria. 2012-05-31 - 2012-06-03.

    Abstract

    Fluency assessment is part of many official language tests (e.g., TOEFL iBT) which evaluate non-native speakers’ language proficiency. Operationalizing and validating the notion of fluency involves disentangling the different factors that influence fluency judgments. In a previous study the authors found a primary role in L2 fluency perception for pause characteristics of speech. Therefore, the present experiment was designed to zoom in on the contribution of silent pauses to fluency perception. Native and non-native speech fragments from turns in simulated discussions were recorded. The number and duration of silent pauses in these fragments were manipulated. The manipulations resulted in three conditions: NoPauses (pauses >250ms excised); ShortPauses (pauses >250ms received an altered duration between 250-500ms); LongPauses (pauses >250ms received an altered duration between 750-1000ms). These manipulated native and non-native speech fragments were rated on oral fluency by untrained raters using a Latin Square design. Preliminary results (using Linear Mixed Models) demonstrated that non-native speech was rated as significantly less fluent than the native speech. In both native and non-native speech, the NoPauses condition was rated significantly more fluent than the other conditions. Also, a significant difference was established between ShortPauses and LongPauses. Despite the clear difference in fluency level of native vs. non-native speakers, it is concluded that both the number and duration of silent pauses have an equally strong effect on fluency perception in native and non-native speech. These results suggest that, at least with respect to pauses, the notion of fluency is constant across native and non-native speech.
  • Bosker, H. R., Quené, H., Sanders, T., & De Jong, N. H. (2012). The effect of …silent pauses… on native and non-native fluency perception. Talk presented at the 4th Junior Research Meeting in Applied Linguistics. Antwerp, Belgium. 2012-03-28 - 2012-03-30.

    Abstract

    Fluency assessment is part of many official language tests (e.g., TOEFL iBT) which evaluate non-native speakers’ language development. In order to operationalise the notion of fluency, the different factors that influence fluency judgments must be disentangled. In a previous study the authors found a primary role in L2 fluency perception for pause characteristics of speech. Therefore, the present experiment was designed to zoom in on the contribution of silent pauses to fluency perception, both in native and non-native speech. Speech fragments from turns in simulated discussions were recorded and digitally manipulated. The manipulations resulted in three conditions: NoPauses (pauses >250ms excised); ShortPauses (pauses >250ms received an altered duration between 250-500ms); LongPauses (pauses >250ms received an altered duration between 750-1000ms). These manipulated native and non-native speech fragments were rated on oral fluency by untrained raters using a Latin Square design. Linear Mixed Models of the subjective fluency ratings demonstrated that i) non-native speech was rated as significantly less fluent than native speech, ii) the NoPauses condition was rated significantly more fluent than the other conditions, iii) the effect of silent pauses did not differ in L2 speech relative to L1 speech. Despite the clear difference in fluency level of native vs. non-native speakers, it is concluded that both the number and duration of silent pauses have an equally strong effect on fluency perception in native and non-native speech. These results suggest that, at least with respect to pauses, the notion of fluency is constant across native and non-native speech.
  • Bosker, H. R. (2011). When is speech fluent? The relationship between acoustic speech properties and subjective fluency ratings. Talk presented at The Language Acquisition Group, Max Planck Institute for Psycholinguistics. Nijmegen, The Netherlands.

    Abstract

    The oral fluency level of an L2 speaker is often used as an important measure in assessing language proficiency. In order to improve the objectivity of such language tests, previous studies have attempted to determine the acoustic correlates of fluency (e.g., Cucchiarini et al. 2002). The results of such studies are difficult to interpret since many of these studies have used multifaceted and intercorrelated measures of speech. An example of such a measure is speech rate which is related to both the speed of articulation and the use of pauses. If we want to discern the separate contributions of speed and pausing to fluency judgments, more precise measures are necessary to reveal more subtleties in perceived fluency ratings. Also, we wanted to see to what extent the relationship between acoustic measures and fluency ratings is dependent on the sensitivity of listeners to such speech phenomena. Our experiment investigated fluency perception by first establishing what speech properties listeners are most sensitive to. Three groups of listeners rated the same set of L2 Dutch speech stimuli on either the use of (silent and filled) pauses, speed of delivery or the use of repairs (corrections and repetitions). Stimuli were 20-second excerpts from turns in a simulated discussion. Using linear mixed models the subjective ratings were modelled using non-confounded acoustic measures which only measured one of three aspects of fluency: pausing, speed or repair. Very explicit test instructions resulted in high interrater reliability. Of the three rater groups the ratings from the ‘pause group’ were best predicted by our linear mixed models as evidenced by high explained speaker variance. It is concluded that raters are most sensitive to the use of pauses in speech. A fourth group of listeners rated the same stimuli on overall fluency. Modelling these ratings using only pause measures as predictors already resulted in high explained speaker variance. 
It is concluded that pause measures are the best acoustic correlates of fluency. Our results will be related to previous literature, and a recent follow-up experiment further investigating the relationship between silent pauses and fluency ratings in both L1 and L2 speech will be introduced.
  • Bosker, H. R., Pinget, A.-F., Quené, H., Sanders, T., & De Jong, N. H. (2011). When is speech fluent? The relationship between acoustic speech properties and subjective fluency ratings. Poster presented at the 12th NVP Winter Conference on Cognition, Brain, and Behaviour (Dutch Psychonomic Society), Egmond aan Zee, The Netherlands.

    Abstract

    The oral fluency level of an L2 speaker is often used as an important measure in language tests. Arguing that fluency ratings are dependent on the perception of acoustic speech characteristics, Experiment 1 investigated which speech properties raters are most sensitive to. Three groups of listeners rated the same set of L2 Dutch speech stimuli on, respectively, the use of pauses, speed of delivery or the use of corrections and repetitions. Using linear mixed models the subjective ratings were modelled by clusters of acoustic measures which only measured one aspect of fluency (pause, speed or repairs). Listeners were shown to be most sensitive to pause characteristics of speech. A fourth group of listeners rated the same stimuli on overall fluency. The variability of these ratings was best modelled by pause measures. It is concluded that pause measures are best candidates for acoustic correlates of fluency. Therefore, Experiment 2 investigates the independent effects of the number and duration of silent pauses in L1 and L2 speech. By comparing the ratings on speech stimuli that have been manipulated in the number and/or duration of silent pauses, this experiment reveals what effect silent pauses have on fluency perception in L1 and L2 speech.
  • Bosker, H. R., Pinget, A.-F., Quené, H., Sanders, T., & De Jong, N. H. (2011). When is speech fluent? The relationship between acoustic speech properties and subjective fluency ratings. Poster presented at the Workshop Production and Comprehension of Conversational Speech, Nijmegen, The Netherlands.

    Abstract

    The oral fluency level of an L2 speaker is often used as an important measure in assessing language proficiency. In order to improve the objectivity of such language tests, previous studies have attempted to determine the acoustic correlates of fluency (e.g., Cucchiarini et al. 2002). Many of these studies have used multifaceted global measures making the results often difficult to interpret. An example of such a measure is overall speech rate which is confounded because it relates both to speed of articulation and to the use of pauses. There is also much diversity within the literature in the type of instructions given to raters. Arguing that fluency ratings are dependent on the perception of the acoustic characteristics of speech, Experiment 1 investigated fluency perception by establishing what speech properties raters are capable of perceiving. Three groups of listeners rated the same set of L2 Dutch speech stimuli on either the use of pauses, speed of delivery or the use of repairs (corrections and repetitions). Stimuli were 20-second excerpts from turns in a simulated discussion. Using linear mixed models the subjective ratings were modelled by non-confounded acoustic measures which only measured one aspect of fluency (pause, speed or repairs). Explicit and very specific test instructions resulted in high interrater reliability. Most of the variability of the ratings from the pause group and the speed group was accounted for by pause or speed measures, respectively. Concluding that raters are capable of perceiving and rating pause and speed phenomena (but repair phenomena to a lesser extent), a fourth group of listeners rated the same stimuli on overall fluency. The variability of these ratings was best modelled by pause and speed measures. It is concluded that pause and speed measures are better acoustic correlates of fluency than repair measures. 
Considering the strong effect of pause measures on fluency perception, Experiment 2 investigates the independent effects of the number of silent pauses and the duration of silent pauses, both in L1 and in L2 speech. Instead of looking at correlations, this experiment attempts to establish a clear causal relationship between these two acoustic speech properties and fluency ratings. By comparing the ratings on identical stimuli differing only in the number or the duration of silent pauses, this experiment reveals whether the number of silent pauses and/or their duration have any effect on fluency perception, both in L1 and in L2 speech.
  • Bosker, H. R., Pinget, A.-F., Quené, H., Sanders, T., & De Jong, N. H. (2011). Waar rekenen we T2 spraak op af? Het effect van instructies op vloeiendheidsoordelen. Talk presented at the 12th VIOT conference: Taalgebruik en Diversiteit. Leiden, The Netherlands. 2011-12-21 - 2011-12-23.

    Abstract

    L2 speakers of Dutch are often characterized as ‘less fluent’. But what does it mean to speak ‘fluently’? Can particular acoustic characteristics of speech be identified that are decisive in this respect? Experiment 1 first investigated listeners’ sensitivity to specific acoustic speech properties. Three groups of listeners rated the same set of L2 Dutch recordings on either the use of pauses, the speech rate, or the use of repetitions and corrections. When their subjective judgments were modelled with acoustic measures as predictors, listeners turned out to be most sensitive to pause characteristics. When a fourth group of listeners subsequently rated the same stimuli on overall fluency, their subjective fluency judgments were best modelled with pause predictors. The authors conclude that pause measures are the best acoustic correlates of fluency perception. Preliminary results of a second experiment, in which the duration and number of pauses in L1 and L2 speech are systematically manipulated, will be discussed.
  • Bosker, H. R., Quené, H., Pinget, A.-F., & De Jong, N. H. (2011). What do oral fluency raters listen to? The effect of instructions on fluency ratings. Poster presented at the 21st Annual Conference of the European Second Language Association, Stockholm, Sweden.

    Abstract

    The degree of oral fluency of a non-native (L2) speaker is an important measure in assessing language proficiency. Previous studies have analysed listeners' subjective ratings and have attempted to relate these ratings to objective acoustic measurements of the stimuli. Across these studies, however, there is much diversity in the instructions given to raters, even though it is unknown what role these instructions play. For example, instructions to rate fluency by listening for pauses may influence raters to such an extent, that they only attend to the pauses in the speech while disregarding other cues of oral fluency. In this manner, research aiming to relate perceived fluency to measurable speech phenomena runs the risk of circularity. In our experiment, we explicitly manipulated the instructions provided to raters in order to answer three research questions: a) To what extent are listeners capable of rating breakdown fluency, speed fluency, and repair fluency separately? b) Which acoustic correlates contribute to each type of fluency rating? c) Which acoustic correlates contribute to ratings of overall fluency? Four groups of non-expert raters (n = 20 in each group) assessed the same set of L2 Dutch speech materials. One group received instructions to rate overall fluency as the sum of silent and filled pauses (the acoustic correlates of breakdown fluency), speech rate (the acoustic correlates of speed fluency), and corrections and hesitations (the acoustic correlates of repair fluency). Each of the other groups was instructed to attend to only one type of these acoustic correlates (i.e. to pauses, to speech rate, or to corrections and hesitations). The various fluency ratings are related to each other and to the objective acoustic measurements of the speech stimuli. The findings of this correlation study will be relevant for fluency perception studies, and for (second) language testing in general.
  • Bosker, H. R. (2010). Verlossen of flossen: Tracking how learning from talker-specific episodes helps listeners recognize reduced speech. Talk presented at the TWIST conference. Leiden, The Netherlands.
  • Mitterer, H., McQueen, J. M., Bosker, H. R., & Poellmann, K. (2010). Adapting to phonological reduction: Tracking how learning from talker-specific episodes helps listeners recognize reductions. Talk presented at the 5th annual meeting of the Schwerpunktprogramm (SPP) 1234/2: Phonological and phonetic competence: between grammar, signal processing, and neural activity. München, Germany.
