Wu, M., Bosker, H. R., & Riecke, L. (2023). Sentential contextual facilitation of auditory word processing builds up during sentence tracking. Journal of Cognitive Neuroscience, 35(8), 1262-1278. doi:10.1162/jocn_a_02007.
Abstract
While listening to meaningful speech, auditory input is processed more rapidly near the end (vs. beginning) of sentences. Although several studies have shown such word-to-word changes in auditory input processing, it is still unclear from which processing level these word-to-word dynamics originate. We investigated whether predictions derived from sentential context can result in auditory word-processing dynamics during sentence tracking. We presented healthy human participants with auditory stimuli consisting of word sequences, arranged into either predictable (coherent sentences) or less predictable (unstructured, random word sequences) 42-Hz amplitude-modulated speech, and a continuous 25-Hz amplitude-modulated distractor tone. We recorded RTs and frequency-tagged neuroelectric responses (auditory steady-state responses) to individual words at multiple temporal positions within the sentences, and quantified sentential context effects at each position while controlling for individual word characteristics (i.e., phonetics, frequency, and familiarity). We found that sentential context increasingly facilitates auditory word processing, as evidenced by accelerated RTs and increased auditory steady-state responses to later-occurring words within sentences. These purely top–down, contextually driven auditory word-processing dynamics occurred only when listeners focused their attention on the speech and did not transfer to the auditory processing of the concurrent distractor tone. These findings indicate that auditory word-processing dynamics during sentence tracking can originate from sentential predictions. The predictions depend on the listeners' attention to the speech, and affect only the processing of the parsed speech, not that of concurrently presented auditory streams.
Severijnen, G. G. A., Bosker, H. R., & McQueen, J. M. (2023). Syllable rate drives rate normalization, but is not the only factor. In R. Skarnitzl & J. Volín (Eds.), Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS 2023) (pp. 56-60). Prague: Guarant International.
Abstract
Speech is perceived relative to the speech rate in the context. It is unclear, however, what information listeners use to compute speech rate. The present study examines whether listeners use the number of syllables per unit time (i.e., syllable rate) as a measure of speech rate, as indexed by subsequent vowel perception. We ran two rate-normalization experiments in which participants heard duration-matched word lists that contained either monosyllabic or bisyllabic words (Experiment 1), or monosyllabic or trisyllabic pseudowords (Experiment 2). The participants’ task was to categorize an /ɑ-aː/ continuum that followed the word lists. The monosyllabic condition was perceived as slower (i.e., fewer /aː/ responses) than the bisyllabic and trisyllabic conditions. However, no difference was observed between bisyllabic and trisyllabic contexts. Therefore, while syllable rate is used in perceiving speech rate, other factors, such as fast speech processes, mean F0, and intensity, must also influence rate normalization.
Severijnen, G. G. A., Di Dona, G., Bosker, H. R., & McQueen, J. M. (2023). Tracking talker-specific cues to lexical stress: Evidence from perceptual learning. Journal of Experimental Psychology: Human Perception and Performance, 49(4), 549-565. doi:10.1037/xhp0001105.
Abstract
When recognizing spoken words, listeners are confronted by variability in the speech signal caused by talker differences. Previous research has focused on segmental talker variability; less is known about how suprasegmental variability is handled. Here we investigated the use of perceptual learning to deal with between-talker differences in lexical stress. Two groups of participants heard Dutch minimal stress pairs (e.g., VOORnaam vs. voorNAAM, “first name” vs. “respectable”) spoken by two male talkers. Group 1 heard Talker 1 use only F0 to signal stress (intensity and duration values were ambiguous), while Talker 2 used only intensity (F0 and duration were ambiguous). Group 2 heard the reverse talker-cue mappings. After training, participants were tested on words from both talkers containing conflicting stress cues (“mixed items”; e.g., one spoken by Talker 1 with F0 signaling initial stress and intensity signaling final stress). We found that listeners used previously learned information about which talker used which cue to interpret the mixed items. For example, the mixed item described above tended to be interpreted as having initial stress by Group 1 but as having final stress by Group 2. This demonstrates that listeners learn how individual talkers signal stress and use that knowledge in spoken-word recognition.
Additional information: XHP-2022-2184_Supplemental_materials_xhp0001105.docx
Uluşahin, O., Bosker, H. R., McQueen, J. M., & Meyer, A. S. (2023). No evidence for convergence to sub-phonemic F2 shifts in shadowing. In R. Skarnitzl & J. Volín (Eds.), Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS 2023) (pp. 96-100). Prague: Guarant International.
Abstract
Over the course of a conversation, interlocutors sound more and more like each other in a process called convergence. However, the automaticity and grain size of convergence are not well established. This study therefore examined whether female native Dutch speakers converge to large yet sub-phonemic shifts in the F2 of the vowel /e/. Participants first performed a short reading task to establish baseline F2s for the vowel /e/, then shadowed 120 target words (alongside 360 fillers), each of which contained one instance of a manipulated vowel /e/ whose F2 had been shifted down to that of the vowel /ø/. Consistent exposure to large (sub-phonemic) downward shifts in F2 did not result in convergence. The results raise issues for theories which view convergence as a product of automatic integration between perception and production.
Bosker, H. R., Van Os, M., Does, R., & Van Bergen, G. (2019). Counting 'uhm's: how tracking the distribution of native and non-native disfluencies influences online language comprehension. Journal of Memory and Language, 106, 189-202. doi:10.1016/j.jml.2019.02.006.
Abstract
Disfluencies, like 'uh', have been shown to help listeners anticipate reference to low-frequency words. The associative account of this 'disfluency bias' proposes that listeners learn to associate disfluency with low-frequency referents based on prior exposure to non-arbitrary disfluency distributions (i.e., greater probability of low-frequency words after disfluencies). However, there is limited evidence for listeners actually tracking disfluency distributions online. The present experiments are the first to show that adult listeners, exposed to a typical or more atypical disfluency distribution (i.e., hearing a talker unexpectedly say 'uh' before high-frequency words), flexibly adjust their predictive strategies to the disfluency distribution at hand (e.g., learn to predict high-frequency referents after disfluency). However, when listeners were presented with the same atypical disfluency distribution but produced by a non-native speaker, no adjustment was observed. This suggests pragmatic inferences can modulate distributional learning, revealing the flexibility of, and constraints on, distributional learning in incremental language comprehension.
Maslowski, M., Meyer, A. S., & Bosker, H. R. (2019). How the tracking of habitual rate influences speech perception. Journal of Experimental Psychology: Learning, Memory, and Cognition, 45(1), 128-138. doi:10.1037/xlm0000579.
Abstract
Listeners are known to track statistical regularities in speech. Yet, which temporal cues are encoded is unclear. This study tested effects of talker-specific habitual speech rate and talker-independent average speech rate (heard over a longer period of time) on the perception of the temporal Dutch vowel contrast /A/-/a:/. First, Experiment 1 replicated that slow local (surrounding) speech contexts induce fewer long /a:/ responses than faster contexts. Experiment 2 tested effects of long-term habitual speech rate. One high-rate group listened to ambiguous vowels embedded in 'neutral' speech from talker A, intermixed with speech from fast talker B. Another low-rate group listened to the same 'neutral' speech from talker A, but to talker B being slow. Between-group comparison of the 'neutral' trials showed that the high-rate group gave a lower proportion of /a:/ responses, indicating that talker A's habitual speech rate sounded slower when B was faster. In Experiment 3, both talkers produced speech at both rates, removing the difference in habitual speech rate between talkers A and B while maintaining the difference in average rate between groups. This time, no global rate effect was observed. Taken together, the present experiments show that a talker's habitual rate is encoded relative to the habitual rate of another talker, carrying implications for episodic and constraint-based models of speech perception.
Maslowski, M., Meyer, A. S., & Bosker, H. R. (2019). Listeners normalize speech for contextual speech rate even without an explicit recognition task. The Journal of the Acoustical Society of America, 146(1), 179-188. doi:10.1121/1.5116004.
Abstract
Speech can be produced at different rates. Listeners take this rate variation into account by normalizing vowel duration for contextual speech rate: An ambiguous Dutch word /m?t/ is perceived as short /mAt/ when embedded in a slow context, but long /ma:t/ in a fast context. Whilst some have argued that this rate normalization involves low-level automatic perceptual processing, there is also evidence that it arises at higher-level cognitive processing stages, such as decision making. Prior research on rate-dependent speech perception has only used explicit recognition tasks to investigate the phenomenon, involving both perceptual processing and decision making. This study tested whether speech rate normalization can be observed without explicit decision making, using a cross-modal repetition priming paradigm. Results show that a fast precursor sentence makes an embedded ambiguous prime (/m?t/) sound (implicitly) more /a:/-like, facilitating lexical access to the long target word "maat" in an (explicit) lexical decision task. This result suggests that rate normalization is automatic, taking place even in the absence of an explicit recognition task. Thus, rate normalization is placed within the realm of everyday spoken conversation, where explicit categorization of ambiguous sounds is rare.
Additional information: https://asa.scitation.org/doi/suppl/10.1121/1.5116004
Rodd, J., Bosker, H. R., Ten Bosch, L., & Ernestus, M. (2019). Deriving the onset and offset times of planning units from acoustic and articulatory measurements. The Journal of the Acoustical Society of America, 145(2), EL161-EL167. doi:10.1121/1.5089456.
Abstract
Many psycholinguistic models of speech sequence planning make claims about the onset and offset times of planning units, such as words, syllables, and phonemes. These predictions typically go untested, however, since psycholinguists have assumed that the temporal dynamics of the speech signal are a poor index of the temporal dynamics of the underlying speech planning process. This article argues that this problem is tractable, and presents and validates two simple metrics that derive planning unit onset and offset times from the acoustic signal and articulatographic data.