Linda Drijvers

Publications

Displaying 1 - 15 of 15
  • Drijvers, L., & Ozyurek, A. (2020). Non-native listeners benefit less from gestures and visible speech than native listeners during degraded speech comprehension. Language and Speech, 63(2), 209-220. doi:10.1177/0023830919831311.

    Abstract

    Native listeners benefit from both visible speech and iconic gestures to enhance degraded speech comprehension (Drijvers & Ozyürek, 2017). We tested how highly proficient non-native listeners benefit from these visual articulators compared to native listeners. We presented videos of an actress uttering a verb in clear, moderately, or severely degraded speech, while her lips were blurred, visible, or visible and accompanied by a gesture. Our results revealed that unlike native listeners, non-native listeners were less likely to benefit from the combined enhancement of visible speech and gestures, especially since the benefit from visible speech was minimal when the signal quality was not sufficient.
  • Drijvers, L., Jensen, O., & Spaak, E. (2020). Rapid invisible frequency tagging reveals nonlinear integration of auditory and visual information. Human Brain Mapping. Advance online publication. doi:10.1002/hbm.25282.

    Abstract

    During communication in real-life settings, the brain integrates information from auditory and visual modalities to form a unified percept of our environment. In the current magnetoencephalography (MEG) study, we used rapid invisible frequency tagging (RIFT) to generate steady-state evoked fields and investigated the integration of audiovisual information in a semantic context. We presented participants with videos of an actress uttering action verbs (auditory; tagged at 61 Hz) accompanied by a gesture (visual; tagged at 68 Hz, using a projector with a 1440 Hz refresh rate). Integration ease was manipulated by auditory factors (clear/degraded speech) and visual factors (congruent/incongruent gesture). We identified MEG spectral peaks at the individual (61/68 Hz) tagging frequencies. We furthermore observed a peak at the intermodulation frequency of the auditory and visually tagged signals (fvisual – fauditory = 7 Hz), specifically when integration was easiest (i.e., when speech was clear and accompanied by a congruent gesture). This intermodulation peak is a signature of nonlinear audiovisual integration, and was strongest in left inferior frontal gyrus and left temporal regions; areas known to be involved in speech-gesture integration. The enhanced power at the intermodulation frequency thus reflects the ease of integration and demonstrates that speech-gesture information interacts in higher-order language areas. Furthermore, we provide a proof-of-principle of the use of RIFT to study the integration of audiovisual stimuli, in relation to, for instance, semantic context.
  • Ripperda, J., Drijvers, L., & Holler, J. (2020). Speeding up the detection of non-iconic and iconic gestures (SPUDNIG): A toolkit for the automatic detection of hand movements and gestures in video data. Behavior Research Methods, 52(4), 1783-1794. doi:10.3758/s13428-020-01350-2.

    Abstract

    In human face-to-face communication, speech is frequently accompanied by visual signals, especially communicative hand gestures. Analyzing these visual signals requires detailed manual annotation of video data, which is often a labor-intensive and time-consuming process. To facilitate this process, we here present SPUDNIG (SPeeding Up the Detection of Non-iconic and Iconic Gestures), a tool to automatize the detection and annotation of hand movements in video data. We provide a detailed description of how SPUDNIG detects hand movement initiation and termination, as well as open-source code and a short tutorial on an easy-to-use graphical user interface (GUI) of our tool. We then provide a proof-of-principle and validation of our method by comparing SPUDNIG’s output to manual annotations of gestures by a human coder. While the tool does not entirely eliminate the need of a human coder (e.g., for false positives detection), our results demonstrate that SPUDNIG can detect both iconic and non-iconic gestures with very high accuracy, and could successfully detect all iconic gestures in our validation dataset. Importantly, SPUDNIG’s output can directly be imported into commonly used annotation tools such as ELAN and ANVIL. We therefore believe that SPUDNIG will be highly relevant for researchers studying multimodal communication due to its annotations significantly accelerating the analysis of large video corpora.

    Additional information

    data and materials
  • Schubotz, L., Holler, J., Drijvers, L., & Ozyurek, A. (2020). Aging and working memory modulate the ability to benefit from visible speech and iconic gestures during speech-in-noise comprehension. Psychological Research. Advance online publication. doi:10.1007/s00426-020-01363-8.

    Abstract

    When comprehending speech-in-noise (SiN), younger and older adults benefit from seeing the speaker’s mouth, i.e. visible speech. Younger adults additionally benefit from manual iconic co-speech gestures. Here, we investigate to what extent younger and older adults benefit from perceiving both visual articulators while comprehending SiN, and whether this is modulated by working memory and inhibitory control. Twenty-eight younger and 28 older adults performed a word recognition task in three visual contexts: mouth blurred (speech-only), visible speech, or visible speech + iconic gesture. The speech signal was either clear or embedded in multitalker babble. Additionally, there were two visual-only conditions (visible speech, visible speech + gesture). Accuracy levels for both age groups were higher when both visual articulators were present compared to either one or none. However, older adults received a significantly smaller benefit than younger adults, although they performed equally well in speech-only and visual-only word recognition. Individual differences in verbal working memory and inhibitory control partly accounted for age-related performance differences. To conclude, perceiving iconic gestures in addition to visible speech improves younger and older adults’ comprehension of SiN. Yet, the ability to benefit from this additional visual information is modulated by age and verbal working memory. Future research will have to show whether these findings extend beyond the single word level.

    Additional information

    supplementary material
  • Ter Bekke, M., Drijvers, L., & Holler, J. (2020). The predictive potential of hand gestures during conversation: An investigation of the timing of gestures in relation to speech. In Proceedings of the 7th GESPIN - Gesture and Speech in Interaction Conference. Stockholm: KTH Royal Institute of Technology.

    Abstract

    In face-to-face conversation, recipients might use the bodily movements of the speaker (e.g. gestures) to facilitate language processing. It has been suggested that one way through which this facilitation may happen is prediction. However, for this to be possible, gestures would need to precede speech, and it is unclear whether this is true during natural conversation. In a corpus of Dutch conversations, we annotated hand gestures that represent semantic information and occurred during questions, and the word(s) which corresponded most closely to the gesturally depicted meaning. Thus, we tested whether representational gestures temporally precede their lexical affiliates. Further, to see whether preceding gestures may indeed facilitate language processing, we asked whether the gesture-speech asynchrony predicts the response time to the question the gesture is part of. Gestures and their strokes (most meaningful movement component) indeed preceded the corresponding lexical information, thus demonstrating their predictive potential. However, while questions with gestures got faster responses than questions without, there was no evidence that questions with larger gesture-speech asynchronies get faster responses. These results suggest that gestures indeed have the potential to facilitate predictive language processing, but further analyses on larger datasets are needed to test for links between asynchrony and processing advantages.
  • Drijvers, L., Vaitonyte, J., & Ozyurek, A. (2019). Degree of language experience modulates visual attention to visible speech and iconic gestures during clear and degraded speech comprehension. Cognitive Science, 43: e12789. doi:10.1111/cogs.12789.

    Abstract

    Visual information conveyed by iconic hand gestures and visible speech can enhance speech comprehension under adverse listening conditions for both native and non‐native listeners. However, how a listener allocates visual attention to these articulators during speech comprehension is unknown. We used eye‐tracking to investigate whether and how native and highly proficient non‐native listeners of Dutch allocated overt eye gaze to visible speech and gestures during clear and degraded speech comprehension. Participants watched video clips of an actress uttering a clear or degraded (6‐band noise‐vocoded) action verb while performing a gesture or not, and were asked to indicate the word they heard in a cued‐recall task. Gestural enhancement was the largest (i.e., a relative reduction in reaction time cost) when speech was degraded for all listeners, but it was stronger for native listeners. Both native and non‐native listeners mostly gazed at the face during comprehension, but non‐native listeners gazed more often at gestures than native listeners. However, only native but not non‐native listeners' gaze allocation to gestures predicted gestural benefit during degraded speech comprehension. We conclude that non‐native listeners might gaze at gesture more as it might be more challenging for non‐native listeners to resolve the degraded auditory cues and couple those cues to phonological information that is conveyed by visible speech. This diminished phonological knowledge might hinder the use of semantic information that is conveyed by gestures for non‐native compared to native listeners. Our results demonstrate that the degree of language experience impacts overt visual attention to visual articulators, resulting in different visual benefits for native versus non‐native listeners.

    Additional information

    Supporting information
  • Drijvers, L., Van der Plas, M., Ozyurek, A., & Jensen, O. (2019). Native and non-native listeners show similar yet distinct oscillatory dynamics when using gestures to access speech in noise. NeuroImage, 194, 55-67. doi:10.1016/j.neuroimage.2019.03.032.

    Abstract

    Listeners are often challenged by adverse listening conditions during language comprehension induced by external factors, such as noise, but also internal factors, such as being a non-native listener. Visible cues, such as semantic information conveyed by iconic gestures, can enhance language comprehension in such situations. Using magnetoencephalography (MEG) we investigated whether spatiotemporal oscillatory dynamics can predict a listener's benefit of iconic gestures during language comprehension in both internally (non-native versus native listeners) and externally (clear/degraded speech) induced adverse listening conditions. Proficient non-native speakers of Dutch were presented with videos in which an actress uttered a degraded or clear verb, accompanied by a gesture or not, and completed a cued-recall task after every video. The behavioral and oscillatory results obtained from non-native listeners were compared to an MEG study where we presented the same stimuli to native listeners (Drijvers et al., 2018a). Non-native listeners demonstrated a similar gestural enhancement effect as native listeners, but overall scored significantly slower on the cued-recall task. In both native and non-native listeners, an alpha/beta power suppression revealed engagement of the extended language network, motor and visual regions during gestural enhancement of degraded speech comprehension, suggesting similar core processes that support unification and lexical access processes. An individual's alpha/beta power modulation predicted the gestural benefit a listener experienced during degraded speech comprehension. Importantly, however, non-native listeners showed less engagement of the mouth area of the primary somatosensory cortex, left insula (beta), LIFG and ATL (alpha) than native listeners, which suggests that non-native listeners might be hindered in processing the degraded phonological cues and coupling them to the semantic information conveyed by the gesture. Native and non-native listeners thus demonstrated similar yet distinct spatiotemporal oscillatory dynamics when recruiting visual cues to disambiguate degraded speech.

    Additional information

    1-s2.0-S1053811919302216-mmc1.docx
  • Drijvers, L. (2019). On the oscillatory dynamics underlying speech-gesture integration in clear and adverse listening conditions. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Drijvers, L., & Trujillo, J. P. (2018). Commentary: Transcranial magnetic stimulation over left inferior frontal and posterior temporal cortex disrupts gesture-speech integration. Frontiers in Human Neuroscience, 12: 256. doi:10.3389/fnhum.2018.00256.

    Abstract

    A commentary on Transcranial Magnetic Stimulation over Left Inferior Frontal and Posterior Temporal Cortex Disrupts Gesture-Speech Integration by Zhao, W., Riggs, K., Schindler, I., and Holle, H. (2018). J. Neurosci. 10, 1748–1717. doi: 10.1523/JNEUROSCI.1748-17.2017
  • Drijvers, L., Ozyurek, A., & Jensen, O. (2018). Alpha and beta oscillations index semantic congruency between speech and gestures in clear and degraded speech. Journal of Cognitive Neuroscience, 30(8), 1086-1097. doi:10.1162/jocn_a_01301.

    Abstract

    Previous work revealed that visual semantic information conveyed by gestures can enhance degraded speech comprehension, but the mechanisms underlying these integration processes under adverse listening conditions remain poorly understood. We used MEG to investigate how oscillatory dynamics support speech–gesture integration when integration load is manipulated by auditory (e.g., speech degradation) and visual semantic (e.g., gesture congruency) factors. Participants were presented with videos of an actress uttering an action verb in clear or degraded speech, accompanied by a matching (mixing gesture + “mixing”) or mismatching (drinking gesture + “walking”) gesture. In clear speech, alpha/beta power was more suppressed in the left inferior frontal gyrus and motor and visual cortices when integration load increased in response to mismatching versus matching gestures. In degraded speech, beta power was less suppressed over posterior STS and medial temporal lobe for mismatching compared with matching gestures, showing that integration load was lowest when speech was degraded and mismatching gestures could not be integrated and disambiguate the degraded signal. Our results thus provide novel insights on how low-frequency oscillatory modulations in different parts of the cortex support the semantic audiovisual integration of gestures in clear and degraded speech: When speech is clear, the left inferior frontal gyrus and motor and visual cortices engage because higher-level semantic information increases semantic integration load. When speech is degraded, posterior STS/middle temporal gyrus and medial temporal lobe are less engaged because integration load is lowest when visual semantic information does not aid lexical retrieval and speech and gestures cannot be integrated.
  • Drijvers, L., Ozyurek, A., & Jensen, O. (2018). Hearing and seeing meaning in noise: Alpha, beta and gamma oscillations predict gestural enhancement of degraded speech comprehension. Human Brain Mapping, 39(5), 2075-2087. doi:10.1002/hbm.23987.

    Abstract

    During face-to-face communication, listeners integrate speech with gestures. The semantic information conveyed by iconic gestures (e.g., a drinking gesture) can aid speech comprehension in adverse listening conditions. In this magnetoencephalography (MEG) study, we investigated the spatiotemporal neural oscillatory activity associated with gestural enhancement of degraded speech comprehension. Participants watched videos of an actress uttering clear or degraded speech, accompanied by a gesture or not and completed a cued-recall task after watching every video. When gestures semantically disambiguated degraded speech comprehension, an alpha and beta power suppression and a gamma power increase revealed engagement and active processing in the hand-area of the motor cortex, the extended language network (LIFG/pSTS/STG/MTG), medial temporal lobe, and occipital regions. These observed low- and high-frequency oscillatory modulations in these areas support general unification, integration and lexical access processes during online language comprehension, and simulation of and increased visual attention to manual gestures over time. All individual oscillatory power modulations associated with gestural enhancement of degraded speech comprehension predicted a listener's correct disambiguation of the degraded verb after watching the videos. Our results thus go beyond the previously proposed role of oscillatory dynamics in unimodal degraded speech comprehension and provide first evidence for the role of low- and high-frequency oscillations in predicting the integration of auditory and visual information at a semantic level.

    Additional information

    hbm23987-sup-0001-suppinfo01.docx
  • Drijvers, L., & Ozyurek, A. (2018). Native language status of the listener modulates the neural integration of speech and iconic gestures in clear and adverse listening conditions. Brain and Language, 177-178, 7-17. doi:10.1016/j.bandl.2018.01.003.

    Abstract

    Native listeners neurally integrate iconic gestures with speech, which can enhance degraded speech comprehension. However, it is unknown how non-native listeners neurally integrate speech and gestures, as they might process visual semantic context differently than natives. We recorded EEG while native and highly-proficient non-native listeners watched videos of an actress uttering an action verb in clear or degraded speech, accompanied by a matching ('to drive'+driving gesture) or mismatching gesture ('to drink'+mixing gesture). Degraded speech elicited an enhanced N400 amplitude compared to clear speech in both groups, revealing an increase in neural resources needed to resolve the spoken input. A larger N400 effect was found in clear speech for non-natives compared to natives, but in degraded speech only for natives. Non-native listeners might thus process gesture more strongly than natives when speech is clear, but need more auditory cues to facilitate access to gestural semantic information when speech is degraded.
  • Drijvers, L., & Ozyurek, A. (2017). Visual context enhanced: The joint contribution of iconic gestures and visible speech to degraded speech comprehension. Journal of Speech, Language, and Hearing Research, 60, 212-222. doi:10.1044/2016_JSLHR-H-16-0101.

    Abstract

    Purpose This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech comprehension have only been performed separately. Method Twenty participants watched videos of an actress uttering an action verb and completed a free-recall task. The videos were presented in 3 speech conditions (2-band noise-vocoding, 6-band noise-vocoding, clear), 3 multimodal conditions (speech + lips blurred, speech + visible speech, speech + visible speech + gesture), and 2 visual-only conditions (visible speech, visible speech + gesture). Results Accuracy levels were higher when both visual articulators were present compared with 1 or none. The enhancement effects of (a) visible speech, (b) gestural information on top of visible speech, and (c) both visible speech and iconic gestures were larger in 6-band than 2-band noise-vocoding or visual-only conditions. Gestural enhancement in 2-band noise-vocoding did not differ from gestural enhancement in visual-only conditions.
  • Drijvers, L., Mulder, K., & Ernestus, M. (2016). Alpha and gamma band oscillations index differential processing of acoustically reduced and full forms. Brain and Language, 153-154, 27-37. doi:10.1016/j.bandl.2016.01.003.

    Abstract

    Reduced forms like yeshay for yesterday often occur in conversations. Previous behavioral research reported a processing advantage for full over reduced forms. The present study investigated whether this processing advantage is reflected in a modulation of alpha (8–12 Hz) and gamma (30+ Hz) band activity. In three electrophysiological experiments, participants listened to full and reduced forms in isolation (Experiment 1), sentence-final position (Experiment 2), or mid-sentence position (Experiment 3). Alpha power was larger in response to reduced forms than to full forms, but only in Experiments 1 and 2. We interpret these increases in alpha power as reflections of higher auditory cognitive load. In all experiments, gamma power only increased in response to full forms, which we interpret as showing that lexical activation spreads more quickly through the semantic network for full than for reduced forms. These results confirm a processing advantage for full forms, especially in non-medial sentence position.
  • Drijvers, L., Zaadnoordijk, L., & Dingemanse, M. (2015). Sound-symbolism is disrupted in dyslexia: Implications for the role of cross-modal abstraction processes. In D. Noelle, R. Dale, A. S. Warlaumont, J. Yoshimi, T. Matlock, C. D. Jennings, & P. P. Maglio (Eds.), Proceedings of the 37th Annual Meeting of the Cognitive Science Society (CogSci 2015) (pp. 602-607). Austin, Tx: Cognitive Science Society.

    Abstract

    Research into sound-symbolism has shown that people can consistently associate certain pseudo-words with certain referents; for instance, pseudo-words with rounded vowels and sonorant consonants are linked to round shapes, while pseudowords with unrounded vowels and obstruents (with a noncontinuous airflow), are associated with sharp shapes. Such sound-symbolic associations have been proposed to arise from cross-modal abstraction processes. Here we assess the link between sound-symbolism and cross-modal abstraction by testing dyslexic individuals’ ability to make sound-symbolic associations. Dyslexic individuals are known to have deficiencies in cross-modal processing. We find that dyslexic individuals are impaired in their ability to make sound-symbolic associations relative to the controls. Our results shed light on the cognitive underpinnings of sound-symbolism by providing novel evidence for the role —and disruptability— of cross-modal abstraction processes in sound-symbolic eects.

Share this page