-
Emmendorfer, A. K., & Holler, J. (2025). Facial signals shape predictions about the nature of upcoming conversational responses. Scientific Reports, 15: 1381. doi:10.1038/s41598-025-85192-y.
Abstract
Increasing evidence suggests that interlocutors use visual communicative signals to form predictions about unfolding utterances, but there is little data on the predictive potential of facial signals in conversation. In an online experiment with virtual agents, we examine whether facial signals produced by an addressee may allow speakers to anticipate the response to a question before it is given. Participants (n = 80) viewed videos of short conversation fragments between two virtual humans. Each fragment ended with the Questioner asking a question, followed by a pause during which the Responder looked either straight at the Questioner (baseline), or averted their gaze, or accompanied the straight gaze with one of the following facial signals: brow raise, brow frown, nose wrinkle, smile, squint, mouth corner pulled back (dimpler). Participants then indicated on a 6-point scale whether they expected a “yes” or “no” response. Analyses revealed that all signals received different ratings relative to the baseline: brow raises, dimplers, and smiles were associated with more positive responses, gaze aversions, brow frowns, nose wrinkles, and squints with more negative responses. Our findings show that interlocutors may form strong associations between facial signals and upcoming responses to questions, highlighting their predictive potential in face-to-face conversation.
Additional information
supplementary materials -
Hömke, P., Levinson, S. C., Emmendorfer, A. K., & Holler, J. (2025). Eyebrow movements as signals of communicative problems in human face-to-face interaction. Royal Society Open Science, 12(3): 241632. doi:10.1098/rsos.241632.
Abstract
Repair is a core building block of human communication, allowing us to address problems of understanding in conversation. Past research has uncovered the basic mechanisms by which interactants signal and solve such problems. However, the focus has been on verbal interaction, neglecting the fact that human communication is inherently multimodal. Here, we focus on a visual signal particularly prevalent in signalling problems of understanding: eyebrow furrows and raises. We present, first, a corpus study showing that differences in eyebrow actions (furrows versus raises) were systematically associated with differences in the format of verbal repair initiations. Second, we present a follow-up study using an avatar that allowed us to test the causal consequences of addressee eyebrow movements, zooming into the effect of eyebrow furrows as signals of trouble in understanding in particular. The results revealed that addressees’ eyebrow furrows have a striking effect on speakers’ speech, leading speakers to produce answers to questions several seconds longer than when not perceiving addressee eyebrow furrows while speaking. Together, the findings demonstrate that eyebrow movements play a communicative role in initiating repair during conversation rather than being merely epiphenomenal and that their occurrence can critically influence linguistic behaviour. Thus, eyebrow movements should be considered core coordination devices in human conversational interaction.
Additional information
link to preprint -
Ter Bekke, M., Drijvers, L., & Holler, J. (2025). Co-speech hand gestures are used to predict upcoming meaning. Psychological Science. Advance online publication. doi:10.1177/09567976251331041.
Abstract
In face-to-face conversation, people use speech and gesture to convey meaning. Seeing gestures alongside speech facilitates comprehenders’ language processing, but crucially, the mechanisms underlying this facilitation remain unclear. We investigated whether comprehenders use the semantic information in gestures, typically preceding related speech, to predict upcoming meaning. Dutch adults listened to questions asked by a virtual avatar. Questions were accompanied by an iconic gesture (e.g., typing) or meaningless control movement (e.g., arm scratch) followed by a short pause and target word (e.g., “type”). A Cloze experiment showed that gestures improved explicit predictions of upcoming target words. Moreover, an EEG experiment showed that gestures reduced alpha and beta power during the pause, indicating anticipation, and reduced N400 amplitudes, demonstrating facilitated semantic processing. Thus, comprehenders use iconic gestures to predict upcoming meaning. Theories of linguistic prediction should incorporate communicative bodily signals as predictive cues to capture how language is processed in face-to-face interaction.
Additional information
supplementary material -
Tilston, O., Holler, J., & Bangerter, A. (2025). Opening social interactions: The coordination of approach, gaze, speech and handshakes during greetings. Cognitive Science, 49(2): e70049. doi:10.1111/cogs.70049.
Abstract
Despite the importance of greetings for opening social interactions, their multimodal coordination processes remain poorly understood. We used a naturalistic, lab-based setup where pairs of unacquainted participants approached and greeted each other while unaware their greeting behavior was studied. We measured the prevalence and time course of multimodal behaviors potentially culminating in a handshake, including motor behaviors (e.g., walking, standing up, hand movements like raise, grasp, and retraction), gaze patterns (using eye tracking glasses), and speech (close and distant verbal salutations). We further manipulated the visibility of partners’ eyes to test its effect on gaze. Our findings reveal that gaze to a partner's face increases over the course of a greeting, but is partly averted during approach and is influenced by the visibility of partners’ eyes. Gaze helps coordinate handshakes, by signaling intent and guiding the grasp. The timing of adjacency pairs in verbal salutations is comparable to the precision of floor transitions in the main body of conversations, and varies according to greeting phase, with distant salutation pair parts featuring more gaps and close salutation pair parts featuring more overlap. Gender composition and a range of multimodal behaviors affect whether pairs chose to shake hands or not. These findings fill several gaps in our understanding of greetings and provide avenues for future research, including advancements in social robotics and human−robot interaction. -
Trujillo, J. P., Dyer, R. M. K., & Holler, J. (2025). Dyadic differences in empathy scores are associated with kinematic similarity during conversational question-answer pairs. Discourse Processes, 62(3), 195-213. doi:10.1080/0163853X.2025.2467605.
Abstract
During conversation, speakers coordinate and synergize their behaviors at multiple levels, and in different ways. The extent to which individuals converge or diverge in their behaviors during interaction may relate to interpersonal differences relevant to social interaction, such as empathy as measured by the empathy quotient (EQ). An association between interpersonal difference in empathy and interpersonal entrainment could help to throw light on how interlocutor characteristics influence interpersonal entrainment. We investigated this possibility in a corpus of unconstrained conversation between dyads. We used dynamic time warping to quantify entrainment between interlocutors of head motion, hand motion, and maximum speech f0 during question–response sequences. We additionally calculated interlocutor differences in EQ scores. We found that, for both head and hand motion, greater difference in EQ was associated with higher entrainment. Thus, we consider that people who are dissimilar in EQ may need to “ground” their interaction with low-level movement entrainment. There was no significant relationship between f0 entrainment and EQ score differences. -
Trujillo, J. P., & Holler, J. (2025). Multimodal information density is highest in question beginnings, and early entropy is associated with fewer but longer visual signals. Discourse Processes, 62(2), 69-88. doi:10.1080/0163853X.2024.2413314.
Abstract
When engaged in spoken conversation, speakers convey meaning using both speech and visual signals, such as facial expressions and manual gestures. An important question is how information is distributed in utterances during face-to-face interaction when information from visual signals is also present. In a corpus of casual Dutch face-to-face conversations, we focus on spoken questions in particular because they occur frequently, thus constituting core building blocks of conversation. We quantified information density (i.e. lexical entropy and surprisal) and the number and relative duration of facial and manual signals. We tested whether lexical information density or the number of visual signals differed between the first and last halves of questions, as well as whether the number of visual signals occurring in the less-predictable portion of a question was associated with the lexical information density of the same portion of the question in a systematic manner. We found that information density, as well as number of visual signals, were higher in the first half of questions, and specifically lexical entropy was associated with fewer, but longer visual signals. The multimodal front-loading of questions and the complementary distribution of visual signals and high entropy words in Dutch casual face-to-face conversations may have implications for the parallel processes of utterance comprehension and response planning during turn-taking.
Additional information
supplemental material -
Holler, J., Kendrick, K. H., Casillas, M., & Levinson, S. C. (2015). Editorial: Turn-taking in human communicative interaction. Frontiers in Psychology, 6: 1919. doi:10.3389/fpsyg.2015.01919.
-
Holler, J., Kokal, I., Toni, I., Hagoort, P., Kelly, S. D., & Ozyurek, A. (2015). Eye’m talking to you: Speakers’ gaze direction modulates co-speech gesture processing in the right MTG. Social Cognitive & Affective Neuroscience, 10, 255-261. doi:10.1093/scan/nsu047.
Abstract
Recipients process information from speech and co-speech gestures, but it is currently unknown how this processing is influenced by the presence of other important social cues, especially gaze direction, a marker of communicative intent. Such cues may modulate neural activity in regions associated either with the processing of ostensive cues, such as eye gaze, or with the processing of semantic information, provided by speech and gesture. Participants were scanned (fMRI) while taking part in triadic communication involving two recipients and a speaker. The speaker uttered sentences that were and were not accompanied by complementary iconic gestures. Crucially, the speaker alternated her gaze direction, thus creating two recipient roles: addressed (direct gaze) vs unaddressed (averted gaze) recipient. The comprehension of Speech&Gesture relative to SpeechOnly utterances recruited middle occipital, middle temporal and inferior frontal gyri, bilaterally. The calcarine sulcus and posterior cingulate cortex were sensitive to differences between direct and averted gaze. Most importantly, Speech&Gesture utterances, but not SpeechOnly utterances, produced additional activity in the right middle temporal gyrus when participants were addressed. Marking communicative intent with gaze direction modulates the processing of speech–gesture utterances in cerebral areas typically associated with the semantic processing of multi-modal communicative acts. -
Holler, J., & Kendrick, K. H. (2015). Unaddressed participants’ gaze in multi-person interaction: Optimizing recipiency. Frontiers in Psychology, 6: 98. doi:10.3389/fpsyg.2015.00098.
Abstract
One of the most intriguing aspects of human communication is its turn-taking system. It requires the ability to process on-going turns at talk while planning the next, and to launch this next turn without considerable overlap or delay. Recent research has investigated the eye movements of observers of dialogues to gain insight into how we process turns at talk. More specifically, this research has focused on the extent to which we are able to anticipate the end of current and the beginning of next turns. At the same time, there has been a call for shifting experimental paradigms exploring social-cognitive processes away from passive observation towards online processing. Here, we present research that responds to this call by situating state-of-the-art technology for tracking interlocutors’ eye movements within spontaneous, face-to-face conversation. Each conversation involved three native speakers of English. The analysis focused on question-response sequences involving just two of those participants, thus rendering the third momentarily unaddressed. Temporal analyses of the unaddressed participants’ gaze shifts from current to next speaker revealed that unaddressed participants are able to anticipate next turns, and moreover, that they often shift their gaze towards the next speaker before the current turn ends. However, an analysis of the complex structure of turns at talk revealed that the planning of these gaze shifts virtually coincides with the points at which the turns first become recognizable as possibly complete. We argue that the timing of these eye movements is governed by an organizational principle whereby unaddressed participants shift their gaze at a point that appears interactionally most optimal: It provides unaddressed participants with access to much of the visual, bodily behavior that accompanies both the current speaker’s and the next speaker’s turn, and it allows them to display recipiency with regard to both speakers’ turns. -
Kelly, S., Healey, M., Ozyurek, A., & Holler, J. (2015). The processing of speech, gesture and action during language comprehension. Psychonomic Bulletin & Review, 22, 517-523. doi:10.3758/s13423-014-0681-7.
Abstract
Hand gestures and speech form a single integrated system of meaning during language comprehension, but is gesture processed with speech in a unique fashion? We had subjects watch multimodal videos that presented auditory (words) and visual (gestures and actions on objects) information. Half of the subjects related the audio information to a written prime presented before the video, and the other half related the visual information to the written prime. For half of the multimodal video stimuli, the audio and visual information contents were congruent, and for the other half, they were incongruent. For all subjects, stimuli in which the gestures and actions were incongruent with the speech produced more errors and longer response times than did stimuli that were congruent, but this effect was less prominent for speech-action stimuli than for speech-gesture stimuli. However, subjects focusing on visual targets were more accurate when processing actions than gestures. These results suggest that although actions may be easier to process than gestures, gestures may be more tightly tied to the processing of accompanying speech. -
Peeters, D., Chu, M., Holler, J., Hagoort, P., & Ozyurek, A. (2015). Electrophysiological and kinematic correlates of communicative intent in the planning and production of pointing gestures and speech. Journal of Cognitive Neuroscience, 27(12), 2352-2368. doi:10.1162/jocn_a_00865.
Abstract
In everyday human communication, we often express our communicative intentions by manually pointing out referents in the material world around us to an addressee, often in tight synchronization with referential speech. This study investigated whether and how the kinematic form of index finger pointing gestures is shaped by the gesturer's communicative intentions and how this is modulated by the presence of concurrently produced speech. Furthermore, we explored the neural mechanisms underpinning the planning of communicative pointing gestures and speech. Two experiments were carried out in which participants pointed at referents for an addressee while the informativeness of their gestures and speech was varied. Kinematic and electrophysiological data were recorded online. It was found that participants prolonged the duration of the stroke and poststroke hold phase of their gesture to be more communicative, in particular when the gesture was carrying the main informational burden in their multimodal utterance. Frontal and P300 effects in the ERPs suggested the importance of intentional and modality-independent attentional mechanisms during the planning phase of informative pointing gestures. These findings contribute to a better understanding of the complex interplay between action, attention, intention, and language in the production of pointing gestures, a communicative act core to human interaction. -
Rowbotham, S., Lloyd, D. M., Holler, J., & Wearden, A. (2015). Externalizing the private experience of pain: A role for co-speech gestures in pain communication? Health Communication, 30(1), 70-80. doi:10.1080/10410236.2013.836070.
Abstract
Despite the importance of effective pain communication, talking about pain represents a major challenge for patients and clinicians because pain is a private and subjective experience. Focusing primarily on acute pain, this article considers the limitations of current methods of obtaining information about the sensory characteristics of pain and suggests that spontaneously produced “co-speech hand gestures” may constitute an important source of information here. Although this is a relatively new area of research, we present recent empirical evidence that reveals that co-speech gestures contain important information about pain that can both add to and clarify speech. Following this, we discuss how these findings might eventually lead to a greater understanding of the sensory characteristics of pain, and to improvements in treatment and support for pain sufferers. We hope that this article will stimulate further research and discussion of this previously overlooked dimension of pain communication. -
Schubotz, L., Holler, J., & Ozyurek, A. (2015). Age-related differences in multi-modal audience design: Young, but not old speakers, adapt speech and gestures to their addressee's knowledge. In G. Ferré, & M. Tutton (Eds.), Proceedings of the 4th GESPIN - Gesture & Speech in Interaction Conference (pp. 211-216). Nantes: Université of Nantes.
Abstract
Speakers can adapt their speech and co-speech gestures for addressees. Here, we investigate whether this ability is modulated by age. Younger and older adults participated in a comic narration task in which one participant (the speaker) narrated six short comic stories to another participant (the addressee). One half of each story was known to both participants, the other half only to the speaker. Younger but not older speakers used more words and gestures when narrating novel story content as opposed to known content. We discuss cognitive and pragmatic explanations of these findings and relate them to theories of gesture production. -
Holler, J., & Beattie, G. (2002). A micro-analytic investigation of how iconic gestures and speech represent core semantic features in talk. Semiotica, 142, 31-69.