Hans Rutger Bosker

Publications

Displaying 1 - 37 of 37
  • Kaufeld, G., Naumann, W., Meyer, A. S., Bosker, H. R., & Martin, A. E. (in press). Contextual speech rate influences morphosyntactic prediction and integration. Language, Cognition and Neuroscience.
  • Rodd, J., Bosker, H. R., Ernestus, M., Alday, P. M., Meyer, A. S., & Ten Bosch, L. (in press). Control of speaking rate is achieved by switching between qualitatively distinct cognitive ‘gaits’: Evidence from simulation. Psychological Review.
  • Bosker, H. R., Van Os, M., Does, R., & Van Bergen, G. (2019). Counting 'uhm's: how tracking the distribution of native and non-native disfluencies influences online language comprehension. Journal of Memory and Language, 106, 189-202. doi:10.1016/j.jml.2019.02.006.

    Abstract

    Disfluencies, like 'uh', have been shown to help listeners anticipate reference to low-frequency words. The associative account of this 'disfluency bias' proposes that listeners learn to associate disfluency with low-frequency referents based on prior exposure to non-arbitrary disfluency distributions (i.e., greater probability of low-frequency words after disfluencies). However, there is limited evidence for listeners actually tracking disfluency distributions online. The present experiments are the first to show that adult listeners, exposed to a typical or more atypical disfluency distribution (i.e., hearing a talker unexpectedly say uh before high-frequency words), flexibly adjust their predictive strategies to the disfluency distribution at hand (e.g., learn to predict high-frequency referents after disfluency). However, when listeners were presented with the same atypical disfluency distribution but produced by a non-native speaker, no adjustment was observed. This suggests pragmatic inferences can modulate distributional learning, revealing the flexibility of, and constraints on, distributional learning in incremental language comprehension.
  • Bosker, H. R., Sjerps, M. J., & Reinisch, E. (2019). Spectral contrast effects are modulated by selective attention in ‘cocktail party’ settings. Attention, Perception & Psychophysics. Advance online publication. doi:10.3758/s13414-019-01824-2.

    Abstract

    Speech sounds are perceived relative to spectral properties of surrounding speech. For instance, target words ambiguous between /bɪt/ (with low F1) and /bɛt/ (with high F1) are more likely to be perceived as “bet” after a ‘low F1’ sentence, but as “bit” after a ‘high F1’ sentence. However, it is unclear how these spectral contrast effects (SCEs) operate in multi-talker listening conditions. Recently, Feng and Oxenham [(2018b). J.Exp.Psychol.-Hum.Percept.Perform. 44(9), 1447–1457] reported that selective attention affected SCEs to a small degree, using two simultaneously presented sentences produced by a single talker. The present study assessed the role of selective attention in more naturalistic ‘cocktail party’ settings, with 200 lexically unique sentences, 20 target words, and different talkers. Results indicate that selective attention to one talker in one ear (while ignoring another talker in the other ear) modulates SCEs in such a way that only the spectral properties of the attended talker influences target perception. However, SCEs were much smaller in multi-talker settings (Experiment 2) than those in single-talker settings (Experiment 1). Therefore, the influence of SCEs on speech comprehension in more naturalistic settings (i.e., with competing talkers) may be smaller than estimated based on studies without competing talkers.

    Supplementary material

    13414_2019_1824_MOESM1_ESM.docx
  • Kaufeld, G., Ravenschlag, A., Meyer, A. S., Martin, A. E., & Bosker, H. R. (2019). Knowledge-based and signal-based cues are weighted flexibly during spoken language comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition. Advance online publication. doi:10.1037/xlm0000744.

    Abstract

    During spoken language comprehension, listeners make use of both knowledge-based and signal-based sources of information, but little is known about how cues from these distinct levels of representational hierarchy are weighted and integrated online. In an eye-tracking experiment using the visual world paradigm, we investigated the flexible weighting and integration of morphosyntactic gender marking (a knowledge-based cue) and contextual speech rate (a signal-based cue). We observed that participants used the morphosyntactic cue immediately to make predictions about upcoming referents, even in the presence of uncertainty about the cue’s reliability. Moreover, we found speech rate normalization effects in participants’ gaze patterns even in the presence of preceding morphosyntactic information. These results demonstrate that cues are weighted and integrated flexibly online, rather than adhering to a strict hierarchy. We further found rate normalization effects in the looking behavior of participants who showed a strong behavioral preference for the morphosyntactic gender cue. This indicates that rate normalization effects are robust and potentially automatic. We discuss these results in light of theories of cue integration and the two-stage model of acoustic context effects
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2019). How the tracking of habitual rate influences speech perception. Journal of Experimental Psychology: Learning, Memory, and Cognition, 45(1), 128-138. doi:10.1037/xlm0000579.

    Abstract

    Listeners are known to track statistical regularities in speech. Yet, which temporal cues are encoded is unclear. This study tested effects of talker-specific habitual speech rate and talker-independent average speech rate (heard over a longer period of time) on the perception of the temporal Dutch vowel contrast /A/-/a:/. First, Experiment 1 replicated that slow local (surrounding) speech contexts induce fewer long /a:/ responses than faster contexts. Experiment 2 tested effects of long-term habitual speech rate. One high-rate group listened to ambiguous vowels embedded in `neutral' speech from talker A, intermixed with speech from fast talker B. Another low-rate group listened to the same `neutral' speech from talker A, but to talker B being slow. Between-group comparison of the `neutral' trials showed that the high-rate group demonstrated a lower proportion of /a:/ responses, indicating that talker A's habitual speech rate sounded slower when B was faster. In Experiment 3, both talkers produced speech at both rates, removing the different habitual speech rates of talker A and B, while maintaining the average rate differing between groups. This time no global rate effect was observed. Taken together, the present experiments show that a talker's habitual rate is encoded relative to the habitual rate of another talker, carrying implications for episodic and constraint-based models of speech perception.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2019). Listeners normalize speech for contextual speech rate even without an explicit recognition task. The Journal of the Acoustical Society of America, 146(1), 179-188. doi:10.1121/1.5116004.

    Abstract

    Speech can be produced at different rates. Listeners take this rate variation into account by normalizing vowel duration for contextual speech rate: An ambiguous Dutch word /m?t/ is perceived as short /mAt/ when embedded in a slow context, but long /ma:t/ in a fast context. Whilst some have argued that this rate normalization involves low-level automatic perceptual processing, there is also evidence that it arises at higher-level cognitive processing stages, such as decision making. Prior research on rate-dependent speech perception has only used explicit recognition tasks to investigate the phenomenon, involving both perceptual processing and decision making. This study tested whether speech rate normalization can be observed without explicit decision making, using a cross-modal repetition priming paradigm. Results show that a fast precursor sentence makes an embedded ambiguous prime (/m?t/) sound (implicitly) more /a:/-like, facilitating lexical access to the long target word "maat" in a (explicit) lexical decision task. This result suggests that rate normalization is automatic, taking place even in the absence of an explicit recognition task. Thus, rate normalization is placed within the realm of everyday spoken conversation, where explicit categorization of ambiguous sounds is rare.
  • Rodd, J., Bosker, H. R., Ten Bosch, L., & Ernestus, M. (2019). Deriving the onset and offset times of planning units from acoustic and articulatory measurements. The Journal of the Acoustical Society of America, 145(2), EL161-EL167. doi:10.1121/1.5089456.

    Abstract

    Many psycholinguistic models of speech sequence planning make claims about the onset and offset times of planning units, such as words, syllables, and phonemes. These predictions typically go untested, however, since psycholinguists have assumed that the temporal dynamics of the speech signal is a poor index of the temporal dynamics of the underlying speech planning process. This article argues that this problem is tractable, and presents and validates two simple metrics that derive planning unit onset and offset times from the acoustic signal and articulatographic data.
  • Bosker, H. R., & Ghitza, O. (2018). Entrained theta oscillations guide perception of subsequent speech: Behavioral evidence from rate normalization. Language, Cognition and Neuroscience, 33(8), 955-967. doi:10.1080/23273798.2018.1439179.

    Abstract

    This psychoacoustic study provides behavioral evidence that neural entrainment in the theta range (3-9 Hz) causally shapes speech perception. Adopting the ‘rate normalization’ paradigm (presenting compressed carrier sentences followed by uncompressed target words), we show that uniform compression of a speech carrier to syllable rates inside the theta range influences perception of subsequent uncompressed targets, but compression outside theta range does not. However, the influence of carriers – compressed outside theta range – on target perception is salvaged when carriers are ‘repackaged’ to have a packet rate inside theta. This suggests that the brain can only successfully entrain to syllable/packet rates within theta range, with a causal influence on the perception of subsequent speech, in line with recent neuroimaging data. Thus, this study points to a central role for sustained theta entrainment in rate normalization and contributes to our understanding of the functional role of brain oscillations in speech perception.
  • Bosker, H. R., & Cooke, M. (2018). Talkers produce more pronounced amplitude modulations when speaking in noise. The Journal of the Acoustical Society of America, 143(2), EL121-EL126. doi:10.1121/1.5024404.

    Abstract

    Speakers adjust their voice when talking in noise (known as Lombard speech), facilitating speech comprehension. Recent neurobiological models of speech perception emphasize the role of amplitude modulations in speech-in-noise comprehension, helping neural oscillators to ‘track’ the attended speech. This study tested whether talkers produce more pronounced amplitude modulations in noise. Across four different corpora, modulation spectra showed greater power in amplitude modulations below 4 Hz in Lombard speech compared to matching plain speech. This suggests that noise-induced speech contains more pronounced amplitude modulations, potentially helping the listening brain to entrain to the attended talker, aiding comprehension.
  • Bosker, H. R. (2018). Putting Laurel and Yanny in context. The Journal of the Acoustical Society of America, 144(6), EL503-EL508. doi:10.1121/1.5070144.

    Abstract

    Recently, the world’s attention was caught by an audio clip that was perceived as “Laurel” or “Yanny”. Opinions were sharply split: many could not believe others heard something different from their perception. However, a crowd-source experiment with >500 participants shows that it is possible to make people hear Laurel, where they previously heard Yanny, by manipulating preceding acoustic context. This study is not only the first to reveal within-listener variation in Laurel/Yanny percepts, but also to demonstrate contrast effects for global spectral information in larger frequency regions. Thus, it highlights the intricacies of human perception underlying these social media phenomena.
  • Kösem, A., Bosker, H. R., Takashima, A., Meyer, A. S., Jensen, O., & Hagoort, P. (2018). Neural entrainment determines the words we hear. Current Biology, 28, 2867-2875. doi:10.1016/j.cub.2018.07.023.

    Abstract

    Low-frequency neural entrainment to rhythmic input has been hypothesized as a canonical mechanism that shapes sensory perception in time. Neural entrainment is deemed particularly relevant for speech analysis, as it would contribute to the extraction of discrete linguistic elements from continuous acoustic signals. However, its causal influence in speech perception has been difficult to establish. Here, we provide evidence that oscillations build temporal predictions about the duration of speech tokens that affect perception. Using magnetoencephalography (MEG), we studied neural dynamics during listening to sentences that changed in speech rate. Weobserved neural entrainment to preceding speech rhythms persisting for several cycles after the change in rate. The sustained entrainment was associated with changes in the perceived duration of the last word’s vowel, resulting in the perception of words with different meanings. These findings support oscillatory models of speech processing, suggesting that neural oscillations actively shape speech perception.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2018). Listening to yourself is special: Evidence from global speech rate tracking. PLoS One, 13(9): e0203571. doi:10.1371/journal.pone.0203571.

    Abstract

    Listeners are known to use adjacent contextual speech rate in processing temporally ambiguous speech sounds. For instance, an ambiguous vowel between short /A/ and long /a:/ in Dutch sounds relatively long (i.e., as /a:/) embedded in a fast precursor sentence, but short in a slow sentence. Besides the local speech rate, listeners also track talker-specific global speech rates. However, it is yet unclear whether other talkers' global rates are encoded with reference to a listener's self-produced rate. Three experiments addressed this question. In Experiment 1, one group of participants was instructed to speak fast, whereas another group had to speak slowly. The groups were compared on their perception of ambiguous /A/-/a:/ vowels embedded in neutral rate speech from another talker. In Experiment 2, the same participants listened to playback of their own speech and again evaluated target vowels in neutral rate speech. Neither of these experiments provided support for the involvement of self-produced speech in perception of another talker's speech rate. Experiment 3 repeated Experiment 2 but with a new participant sample that was unfamiliar with the participants from Experiment 2. This experiment revealed fewer /a:/ responses in neutral speech in the group also listening to a fast rate, suggesting that neutral speech sounds slow in the presence of a fast talker and vice versa. Taken together, the findings show that self-produced speech is processed differently from speech produced by others. They carry implications for our understanding of the perceptual and cognitive mechanisms involved in rate-dependent speech perception in dialogue settings.
  • Van Bergen, G., & Bosker, H. R. (2018). Linguistic expectation management in online discourse processing: An investigation of Dutch inderdaad 'indeed' and eigenlijk 'actually'. Journal of Memory and Language, 103, 191-209. doi:10.1016/j.jml.2018.08.004.

    Abstract

    Interpersonal discourse particles (DPs), such as Dutch inderdaad (≈‘indeed’) and eigenlijk (≈‘actually’) are highly frequent in everyday conversational interaction. Despite extensive theoretical descriptions of their polyfunctionality, little is known about how they are used by language comprehenders. In two visual world eye-tracking experiments involving an online dialogue completion task, we asked to what extent inderdaad, confirming an inferred expectation, and eigenlijk, contrasting with an inferred expectation, influence real-time understanding of dialogues. Answers in the dialogues contained a DP or a control adverb, and a critical discourse referent was replaced by a beep; participants chose the most likely dialogue completion by clicking on one of four referents in a display. Results show that listeners make rapid and fine-grained situation-specific inferences about the use of DPs, modulating their expectations about how the dialogue will unfold. Findings further specify and constrain theories about the conversation-managing function and polyfunctionality of DPs.
  • Bosker, H. R. (2017). Accounting for rate-dependent category boundary shifts in speech perception. Attention, Perception & Psychophysics, 79, 333-343. doi:10.3758/s13414-016-1206-4.

    Abstract

    The perception of temporal contrasts in speech is known to be influenced by the speech rate in the surrounding context. This rate-dependent perception is suggested to involve general auditory processes since it is also elicited by non-speech contexts, such as pure tone sequences. Two general auditory mechanisms have been proposed to underlie rate-dependent perception: durational contrast and neural entrainment. The present study compares the predictions of these two accounts of rate-dependent speech perception by means of four experiments in which participants heard tone sequences followed by Dutch target words ambiguous between /ɑs/ “ash” and /a:s/ “bait”. Tone sequences varied in the duration of tones (short vs. long) and in the presentation rate of the tones (fast vs. slow). Results show that the duration of preceding tones did not influence target perception in any of the experiments, thus challenging durational contrast as explanatory mechanism behind rate-dependent perception. Instead, the presentation rate consistently elicited a category boundary shift, with faster presentation rates inducing more /a:s/ responses, but only if the tone sequence was isochronous. Therefore, this study proposes an alternative, neurobiologically plausible, account of rate-dependent perception involving neural entrainment of endogenous oscillations to the rate of a rhythmic stimulus.
  • Bosker, H. R., & Kösem, A. (2017). An entrained rhythm's frequency, not phase, influences temporal sampling of speech. In Proceedings of Interspeech 2017 (pp. 2416-2420). doi:10.21437/Interspeech.2017-73.

    Abstract

    Brain oscillations have been shown to track the slow amplitude fluctuations in speech during comprehension. Moreover, there is evidence that these stimulus-induced cortical rhythms may persist even after the driving stimulus has ceased. However, how exactly this neural entrainment shapes speech perception remains debated. This behavioral study investigated whether and how the frequency and phase of an entrained rhythm would influence the temporal sampling of subsequent speech. In two behavioral experiments, participants were presented with slow and fast isochronous tone sequences, followed by Dutch target words ambiguous between as /ɑs/ “ash” (with a short vowel) and aas /a:s/ “bait” (with a long vowel). Target words were presented at various phases of the entrained rhythm. Both experiments revealed effects of the frequency of the tone sequence on target word perception: fast sequences biased listeners to more long /a:s/ responses. However, no evidence for phase effects could be discerned. These findings show that an entrained rhythm’s frequency, but not phase, influences the temporal sampling of subsequent speech. These outcomes are compatible with theories suggesting that sensory timing is evaluated relative to entrained frequency. Furthermore, they suggest that phase tracking of (syllabic) rhythms by theta oscillations plays a limited role in speech parsing.
  • Bosker, H. R., & Reinisch, E. (2017). Foreign languages sound fast: evidence from implicit rate normalization. Frontiers in Psychology, 8: 1063. doi:10.3389/fpsyg.2017.01063.

    Abstract

    Anecdotal evidence suggests that unfamiliar languages sound faster than one’s native language. Empirical evidence for this impression has, so far, come from explicit rate judgments. The aim of the present study was to test whether such perceived rate differences between native and foreign languages have effects on implicit speech processing. Our measure of implicit rate perception was “normalization for speaking rate”: an ambiguous vowel between short /a/ and long /a:/ is interpreted as /a:/ following a fast but as /a/ following a slow carrier sentence. That is, listeners did not judge speech rate itself; instead, they categorized ambiguous vowels whose perception was implicitly affected by the rate of the context. We asked whether a bias towards long /a:/ might be observed when the context is not actually faster but simply spoken in a foreign language. A fully symmetrical experimental design was used: Dutch and German participants listened to rate matched (fast and slow) sentences in both languages spoken by the same bilingual speaker. Sentences were followed by nonwords that contained vowels from an /a-a:/ duration continuum. Results from Experiments 1 and 2 showed a consistent effect of rate normalization for both listener groups. Moreover, for German listeners, across the two experiments, foreign sentences triggered more /a:/ responses than (rate matched) native sentences, suggesting that foreign sentences were indeed perceived as faster. Moreover, this Foreign Language effect was modulated by participants’ ability to understand the foreign language: those participants that scored higher on a foreign language translation task showed less of a Foreign Language effect. However, opposite effects were found for the Dutch listeners. For them, their native rather than the foreign language induced more /a:/ responses. Nevertheless, this reversed effect could be reduced when additional spectral properties of the context were controlled for. Experiment 3, using explicit rate judgments, replicated the effect for German but not Dutch listeners. We therefore conclude that the subjective impression that foreign languages sound fast may have an effect on implicit speech processing, with implications for how language learners perceive spoken segments in a foreign language.

    Supplementary material

    data sheet 1.docx
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2017). Cognitive load makes speech sound fast, but does not modulate acoustic context effects. Journal of Memory and Language, 94, 166-176. doi:10.1016/j.jml.2016.12.002.

    Abstract

    In natural situations, speech perception often takes place during the concurrent execution of other cognitive tasks, such as listening while viewing a visual scene. The execution of a dual task typically has detrimental effects on concurrent speech perception, but how exactly cognitive load disrupts speech encoding is still unclear. The detrimental effect on speech representations may consist of either a general reduction in the robustness of processing of the speech signal (‘noisy encoding’), or, alternatively it may specifically influence the temporal sampling of the sensory input, with listeners missing temporal pulses, thus underestimating segmental durations (‘shrinking of time’). The present study investigated whether and how spectral and temporal cues in a precursor sentence that has been processed under high vs. low cognitive load influence the perception of a subsequent target word. If cognitive load effects are implemented through ‘noisy encoding’, increasing cognitive load during the precursor should attenuate the encoding of both its temporal and spectral cues, and hence reduce the contextual effect that these cues can have on subsequent target sound perception. However, if cognitive load effects are expressed as ‘shrinking of time’, context effects should not be modulated by load, but a main effect would be expected on the perceived duration of the speech signal. Results from two experiments indicate that increasing cognitive load (manipulated through a secondary visual search task) did not modulate temporal (Experiment 1) or spectral context effects (Experiment 2). However, a consistent main effect of cognitive load was found: increasing cognitive load during the precursor induced a perceptual increase in its perceived speech rate, biasing the perception of a following target word towards longer durations. This finding suggests that cognitive load effects in speech perception are implemented via ‘shrinking of time’, in line with a temporal sampling framework. In addition, we argue that our results align with a model in which early (spectral and temporal) normalization is unaffected by attention but later adjustments may be attention-dependent.
  • Bosker, H. R. (2017). How our own speech rate influences our perception of others. Journal of Experimental Psychology: Learning, Memory, and Cognition, 43(8), 1225-1238. doi:10.1037/xlm0000381.

    Abstract

    In conversation, our own speech and that of others follow each other in rapid succession. Effects of the surrounding context on speech perception are well documented but, despite the ubiquity of the sound of our own voice, it is unknown whether our own speech also influences our perception of other talkers. This study investigated context effects induced by our own speech through six experiments, specifically targeting rate normalization (i.e., perceiving phonetic segments relative to surrounding speech rate). Experiment 1 revealed that hearing pre-recorded fast or slow context sentences altered the perception of ambiguous vowels, replicating earlier work. Experiment 2 demonstrated that talking at a fast or slow rate prior to target presentation also altered target perception, though the effect of preceding speech rate was reduced. Experiment 3 showed that silent talking (i.e., inner speech) at fast or slow rates did not modulate the perception of others, suggesting that the effect of self-produced speech rate in Experiment 2 arose through monitoring of the external speech signal. Experiment 4 demonstrated that, when participants were played back their own (fast/slow) speech, no reduction of the effect of preceding speech rate was observed, suggesting that the additional task of speech production may be responsible for the reduced effect in Experiment 2. Finally, Experiments 5 and 6 replicate Experiments 2 and 3 with new participant samples. Taken together, these results suggest that variation in speech production may induce variation in speech perception, thus carrying implications for our understanding of spoken communication in dialogue settings.
  • Bosker, H. R. (2017). The role of temporal amplitude modulations in the political arena: Hillary Clinton vs. Donald Trump. In Proceedings of Interspeech 2017 (pp. 2228-2232). doi:10.21437/Interspeech.2017-142.

    Abstract

    Speech is an acoustic signal with inherent amplitude modulations in the 1-9 Hz range. Recent models of speech perception propose that this rhythmic nature of speech is central to speech recognition. Moreover, rhythmic amplitude modulations have been shown to have beneficial effects on language processing and the subjective impression listeners have of the speaker. This study investigated the role of amplitude modulations in the political arena by comparing the speech produced by Hillary Clinton and Donald Trump in the three presidential debates of 2016. Inspection of the modulation spectra, revealing the spectral content of the two speakers’ amplitude envelopes after matching for overall intensity, showed considerably greater power in Clinton’s modulation spectra (compared to Trump’s) across the three debates, particularly in the 1-9 Hz range. The findings suggest that Clinton’s speech had a more pronounced temporal envelope with rhythmic amplitude modulations below 9 Hz, with a preference for modulations around 3 Hz. This may be taken as evidence for a more structured temporal organization of syllables in Clinton’s speech, potentially due to more frequent use of preplanned utterances. Outcomes are interpreted in light of the potential beneficial effects of a rhythmic temporal envelope on intelligibility and speaker perception.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2017). Whether long-term tracking of speech rate affects perception depends on who is talking. In Proceedings of Interspeech 2017 (pp. 586-590). doi:10.21437/Interspeech.2017-1517.

    Abstract

    Speech rate is known to modulate perception of temporally ambiguous speech sounds. For instance, a vowel may be perceived as short when the immediate speech context is slow, but as long when the context is fast. Yet, effects of long-term tracking of speech rate are largely unexplored. Two experiments tested whether long-term tracking of rate influences perception of the temporal Dutch vowel contrast /ɑ/-/a:/. In Experiment 1, one low-rate group listened to 'neutral' rate speech from talker A and to slow speech from talker B. Another high-rate group was exposed to the same neutral speech from A, but to fast speech from B. Between-group comparison of the 'neutral' trials revealed that the low-rate group reported a higher proportion of /a:/ in A's 'neutral' speech, indicating that A sounded faster when B was slow. Experiment 2 tested whether one's own speech rate also contributes to effects of long-term tracking of rate. Here, talker B's speech was replaced by playback of participants' own fast or slow speech. No evidence was found that one's own voice affected perception of talker A in larger speech contexts. These results carry implications for our understanding of the mechanisms involved in rate-dependent speech perception and of dialogue.
  • Bosker, H. R. (2016). Our own speech rate influences speech perception. In J. Barnes, A. Brugos, S. Stattuck-Hufnagel, & N. Veilleux (Eds.), Proceedings of Speech Prosody 2016 (pp. 227-231).

    Abstract

    During conversation, spoken utterances occur in rich acoustic contexts, including speech produced by our interlocutor(s) and speech we produced ourselves. Prosodic characteristics of the acoustic context have been known to influence speech perception in a contrastive fashion: for instance, a vowel presented in a fast context is perceived to have a longer duration than the same vowel in a slow context. Given the ubiquity of the sound of our own voice, it may be that our own speech rate - a common source of acoustic context - also influences our perception of the speech of others. Two experiments were designed to test this hypothesis. Experiment 1 replicated earlier contextual rate effects by showing that hearing pre-recorded fast or slow context sentences alters the perception of ambiguous Dutch target words. Experiment 2 then extended this finding by showing that talking at a fast or slow rate prior to the presentation of the target words also altered the perception of those words. These results suggest that between-talker variation in speech rate production may induce between-talker variation in speech perception, thus potentially explaining why interlocutors tend to converge on speech rate in dialogue settings.

    Supplementary material

    pdf via conference website227
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2016). Listening under cognitive load makes speech sound fast. In H. van den Heuvel, B. Cranen, & S. Mattys (Eds.), Proceedings of the Speech Processing in Realistic Environments [SPIRE] Workshop (pp. 23-24). Groningen.
  • Gibson, M., & Bosker, H. R. (2016). Over vloeiendheid in spraak. Tijdschrift Taal, 7(10), 40-45.
  • Bosker, H. R., Tjiong, V., Quené, H., Sanders, T., & De Jong, N. H. (2015). Both native and non-native disfluencies trigger listeners' attention. In Disfluency in Spontaneous Speech: DISS 2015: An ICPhS Satellite Meeting. Edinburgh: DISS2015.

    Abstract

    Disfluencies, such as uh and uhm, are known to help the listener in speech comprehension. For instance, disfluencies may elicit prediction of less accessible referents and may trigger listeners’ attention to the following word. However, recent work suggests differential processing of disfluencies in native and non-native speech. The current study investigated whether the beneficial effects of disfluencies on listeners’ attention are modulated by the (non-)native identity of the speaker. Using the Change Detection Paradigm, we investigated listeners’ recall accuracy for words presented in disfluent and fluent contexts, in native and non-native speech. We observed beneficial effects of both native and non-native disfluencies on listeners’ recall accuracy, suggesting that native and non-native disfluencies trigger listeners’ attention in a similar fashion.
  • Bosker, H. R., & Reinisch, E. (2015). Normalization for speechrate in native and nonnative speech. In M. Wolters, J. Livingstone, B. Beattie, R. Smith, M. MacMahon, J. Stuart-Smith, & J. Scobbie (Eds.), Proceedings of the 18th International Congresses of Phonetic Sciences (ICPhS 2015). London: International Phonetic Association.

    Abstract

    Speech perception involves a number of processes that deal with variation in the speech signal. One such process is normalization for speechrate: local temporal cues are perceived relative to the rate in the surrounding context. It is as yet unclear whether and how this perceptual effect interacts with higher level impressions of rate, such as a speaker’s nonnative identity. Nonnative speakers typically speak more slowly than natives, an experience that listeners take into account when explicitly judging the rate of nonnative speech. The present study investigated whether this is also reflected in implicit rate normalization. Results indicate that nonnative speech is implicitly perceived as faster than temporally-matched native speech, suggesting that the additional cognitive load of listening to an accent speeds up rate perception. Therefore, rate perception in speech is not dependent on syllable durations alone but also on the ease of processing of the temporal signal.
  • Bosker, H. R., Quené, H., Sanders, T. J. M., & de Jong, N. H. (2014). Native 'um's elicit prediction of low-frequency referents, but non-native 'um's do not. Journal of Memory and Language, 75, 104-116. doi:10.1016/j.jml.2014.05.004.

    Abstract

    Speech comprehension involves extensive use of prediction. Linguistic prediction may be guided by the semantics or syntax, but also by the performance characteristics of the speech signal, such as disfluency. Previous studies have shown that listeners, when presented with the filler uh, exhibit a disfluency bias for discourse-new or unknown referents, drawing inferences about the source of the disfluency. The goal of the present study is to study the contrast between native and non-native disfluencies in speech comprehension. Experiment 1 presented listeners with pictures of high-frequency (e.g., a hand) and low-frequency objects (e.g., a sewing machine) and with fluent and disfluent instructions. Listeners were found to anticipate reference to low-frequency objects when encountering disfluency, thus attributing disfluency to speaker trouble in lexical retrieval. Experiment 2 showed that, when participants listened to disfluent non-native speech, no anticipation of low-frequency referents was observed. We conclude that listeners can adapt their predictive strategies to the (non-native) speaker at hand, extending our understanding of the role of speaker identity in speech comprehension.
  • Bosker, H. R., Quené, H., Sanders, T. J. M., & de Jong, N. H. (2014). The perception of fluency in native and non-native speech. Language Learning, 64, 579-614. doi:10.1111/lang.12067.

    Abstract

    Where native speakers supposedly are fluent by default, non-native speakers often have to strive hard to achieve a native-like fluency level. However, disfluencies (such as pauses, fillers, repairs, etc.) occur in both native and non-native speech and it is as yet unclear ow luency raters weigh the fluency characteristics of native and non-native speech. Two rating experiments compared the way raters assess the luency of native and non-native speech. The fluency characteristics of native and non- native speech were controlled by using phonetic anipulations in pause (Experiment 1) and speed characteristics (Experiment 2). The results show that the ratings on manipulated native and on-native speech were affected in a similar fashion. This suggests that there is no difference in the way listeners weigh the fluency haracteristics of native and non-native speakers.
  • Bosker, H. R.(2014). The processing and evaluation of fluency in native and non-native speech. Research Note for Pearson Language Testing.
  • Bosker, H. R. (2014). The processing and evaluation of fluency in native and non-native speech. PhD Thesis, Utrecht University, Utrecht.

    Abstract

    Disfluency is a common characteristic of spontaneously produced speech. Disfluencies (e.g., silent pauses, filled pauses [uh’s and uhm’s], corrections, repetitions, etc.) occur in both native and non-native speech. There appears to be an apparent contradiction between claims from the evaluative and cognitive approach to fluency. On the one hand, the evaluative approach shows that non-native disfluencies have a negative effect on listeners’ subjective fluency impressions. On the other hand, the cognitive approach reports beneficial effects of native disfluencies on cognitive processes involved in speech comprehension, such as prediction and attention. This dissertation aims to resolve this apparent contradiction by combining the evaluative and cognitive approach. The reported studies target both the evaluation (Chapters 2 and 3) and the processing of fluency (Chapters 4 and 5) in native and non-native speech. Thus, it provides an integrative account of native and non-native fluency perception, informative to both language testing practice and cognitive psycholinguists. The proposed account of fluency perception testifies to the notion that speech performance matters: communication through spoken language does not only depend on what is said, but also on how it is said and by whom.
  • Pinget, A.-F., Bosker, H. R., Quené, H., & de Jong, N. H. (2014). Native speakers' perceptions of fluency and accent in L2 speech. Language Testing, 31, 349-365. doi:10.1177/0265532214526177.

    Abstract

    Oral fluency and foreign accent distinguish L2 from L1 speech production. In language testing practices, both fluency and accent are usually assessed by raters. This study investigates what exactly native raters of fluency and accent take into account when judging L2. Our aim is to explore the relationship between objectively measured temporal, segmental and suprasegmental properties of speech on the one hand, and fluency and accent as rated by native raters on the other hand. For 90 speech fragments from Turkish and English L2 learners of Dutch, several acoustic measures of fluency and accent were calculated. In Experiment 1, 20 native speakers of Dutch rated the L2 Dutch samples on fluency. In Experiment 2, 20 different untrained native speakers of Dutch judged the L2 Dutch samples on accentedness. Regression analyses revealed that acoustic measures of fluency were good predictors of fluency ratings. Secondly, segmental and suprasegmental measures of accent could predict some variance of accent ratings. Thirdly, perceived fluency and perceived accent were only weakly related. In conclusion, this study shows that fluency and perceived foreign accent can be judged as separate constructs.
  • Poellmann, K., Bosker, H. R., McQueen, J. M., & Mitterer, H. (2014). Perceptual adaptation to segmental and syllabic reductions in continuous spoken Dutch. Journal of Phonetics, 46, 101-127. doi:10.1016/j.wocn.2014.06.004.

    Abstract

    This study investigates if and how listeners adapt to reductions in casual continuous speech. In a perceptual-learning variant of the visual-world paradigm, two groups of Dutch participants were exposed to either segmental (/b/ → [ʋ]) or syllabic (ver- → [fː]) reductions in spoken Dutch sentences. In the test phase, both groups heard both kinds of reductions, but now applied to different words. In one of two experiments, the segmental reduction exposure group was better than the syllabic reduction exposure group in recognizing new reduced /b/-words. In both experiments, the syllabic reduction group showed a greater target preference for new reduced ver-words. Learning about reductions was thus applied to previously unheard words. This lexical generalization suggests that mechanisms compensating for segmental and syllabic reductions take place at a prelexical level, and hence that lexical access involves an abstractionist mode of processing. Existing abstractionist models need to be revised, however, as they do not include representations of sequences of segments (corresponding e.g. to ver-) at the prelexical level.
  • Bosker, H. R. (2013). Juncture (prosodic). In G. Khan (Ed.), Encyclopedia of Hebrew Language and Linguistics (pp. 432-434). Leiden: Brill.

    Abstract

    Prosodic juncture concerns the compartmentalization and partitioning of syntactic entities in spoken discourse by means of prosody. It has been argued that the Intonation Unit, defined by internal criteria and prosodic boundary phenomena (e.g., final lengthening, pitch reset, pauses), encapsulates the basic structural unit of spoken Modern Hebrew.
  • Bosker, H. R., Pinget, A.-F., Quené, H., Sanders, T., & De Jong, N. H. (2013). What makes speech sound fluent? The contributions of pauses, speed and repairs. Language testing, 30(2), 159-175. doi:10.1177/0265532212455394.

    Abstract

    The oral fluency level of an L2 speaker is often used as a measure in assessing language proficiency. The present study reports on four experiments investigating the contributions of three fluency aspects (pauses, speed and repairs) to perceived fluency. In Experiment 1 untrained raters evaluated the oral fluency of L2 Dutch speakers. Using specific acoustic measures of pause, speed and repair phenomena, linear regression analyses revealed that pause and speed measures best predicted the subjective fluency ratings, and that repair measures contributed only very little. A second research question sought to account for these results by investigating perceptual sensitivity to acoustic pause, speed and repair phenomena, possibly accounting for the results from Experiment 1. In Experiments 2–4 three new groups of untrained raters rated the same L2 speech materials from Experiment 1 on the use of pauses, speed and repairs. A comparison of the results from perceptual sensitivity (Experiments 2–4) with fluency perception (Experiment 1) showed that perceptual sensitivity alone could not account for the contributions of the three aspects to perceived fluency. We conclude that listeners weigh the importance of the perceived aspects of fluency to come to an overall judgment.
  • Bosker, H. R. (2013). Sibilant consonants. In G. Khan (Ed.), Encyclopedia of Hebrew Language and Linguistics (pp. 557-561). Leiden: Brill.

    Abstract

    Fricative consonants in Hebrew can be divided into bgdkpt and sibilants (ז, ס, צ, שׁ, שׂ). Hebrew sibilants have been argued to stem from Proto-Semitic affricates, laterals, interdentals and /s/. In standard Israeli Hebrew the sibilants are pronounced as [s] (ס and שׂ), [ʃ] (שׁ), [z] (ז), [ʦ] (צ).
  • De Jong, N. H., & Bosker, H. R. (2013). Choosing a threshold for silent pauses to measure second language fluency. In R. Eklund (Ed.), Proceedings of the 6th Workshop on Disfluency in Spontaneous Speech (DiSS) (pp. 17-20).

    Abstract

    Second language (L2) research often involves analyses of acoustic measures of fluency. The studies investigating fluency, however, have been difficult to compare because the measures of fluency that were used differed widely. One of the differences between studies concerns the lower cut-off point for silent pauses, which has been set anywhere between 100 ms and 1000 ms. The goal of this paper is to find an optimal cut-off point. We calculate acoustic measures of fluency using different pause thresholds and then relate these measures to a measure of L2 proficiency and to ratings on fluency.
  • Bosker, H. R., Briaire, J., Heeren, W., van Heuven, V. J., & Jongman, S. R. (2010). Whispered speech as input for cochlear implants. In J. Van Kampen, & R. Nouwen (Eds.), Linguistics in the Netherlands 2010 (pp. 1-14).

Share this page