Are we seeing eye-to-eye?: Gaze allocation to a humanoid robot during conversation

Smilde, F., & Mishra, C. (2026). Are we seeing eye-to-eye?: Gaze allocation to a humanoid robot during conversation. In L. Baillie, W. D. Smart, M. De Graaf, M. Gombolay, & I. Torre (Eds.), HRI Companion '26: Companion Proceedings of the 21st ACM/IEEE International Conference on Human-Robot Interaction (pp. 1065-1069). New York: Association for Computing Machinery. doi:10.1145/3776734.3794558.
Gaze is a key non-verbal cue in face-to-face interaction, yet we know relatively little about how people visually explore a robot’s face during conversation. In human-human interaction (HHI), gaze allocation is shaped by conversational role and task demands: speakers typically avert their gaze from their partner’s face more than listeners do, and listeners often shift gaze from the eyes to the mouth to support speech understanding. In human-robot interaction (HRI), it is often implicitly assumed that gaze to humanoid robots follows similar patterns, but this has rarely been tested quantitatively at the level of specific facial regions. In this late-breaking report, we present a secondary analysis of an existing HRI dataset with usable eye-tracking data from 31 participants who took part in semi-structured interviews with a social robot (Furhat). Using MediaPipe Face Mesh on participants’ egocentric video from eye-tracking glasses, we segmented the robot’s face into eye, mouth, and full-face regions of interest (ROIs), and quantified how participants distributed their gaze across these ROIs over the entire interaction, and separately while speaking and while listening. Participants spent most of the interaction looking at the robot’s face; within the face, the eyes and mouth were the main targets, and gaze to these regions increased during listening, especially for the mouth. This pattern aligns with the central findings from HHI and offers empirical evidence for assumed similarities in gaze allocation between HHI and HRI. In an exploratory analysis, we additionally examined how the robot’s own gaze behaviour, with or without human-like gaze aversions, shaped gaze to the eyes and mouth. We discuss how these findings inform the interpretation of gaze as an implicit engagement cue in HRI. Finally, we provide baseline references and show how ROI-based analyses can enrich future gaze studies in HRI.
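As an illustration of the kind of ROI-based quantification the abstract describes, the following is a minimal sketch and not the authors' pipeline. It assumes each gaze sample has already been paired with per-frame ROI bounding boxes (for example, boxes derived from face landmarks) and a conversational role label. The box format, the ROI names `eyes`, `mouth`, and `face`, the mutually exclusive labelling (a sample inside the eyes box is not also counted as `face_other`), and the function names are all assumptions made for this example.

```python
from collections import Counter

def point_in_box(x, y, box):
    """True if (x, y) lies inside box = (x_min, y_min, x_max, y_max)."""
    x0, y0, x1, y1 = box
    return x0 <= x <= x1 and y0 <= y <= y1

def classify_gaze(x, y, rois):
    """Label a gaze point with the most specific ROI that contains it.

    `rois` is a hypothetical dict with keys "eyes", "mouth", and "face".
    Points on the face but outside the eyes and mouth are "face_other".
    """
    if point_in_box(x, y, rois["eyes"]):
        return "eyes"
    if point_in_box(x, y, rois["mouth"]):
        return "mouth"
    if point_in_box(x, y, rois["face"]):
        return "face_other"
    return "elsewhere"

def gaze_proportions(samples):
    """Proportion of gaze samples per ROI, computed separately per role.

    `samples` is an iterable of (x, y, rois, role) tuples, where role is
    an assumed label such as "speaking" or "listening".
    """
    counts = {}
    for x, y, rois, role in samples:
        counts.setdefault(role, Counter())[classify_gaze(x, y, rois)] += 1
    return {
        role: {roi: n / sum(c.values()) for roi, n in c.items()}
        for role, c in counts.items()
    }

if __name__ == "__main__":
    # Made-up boxes and gaze points, for demonstration only.
    rois = {"face": (0, 0, 100, 100), "eyes": (20, 20, 80, 40), "mouth": (30, 60, 70, 80)}
    demo = [
        (50, 30, rois, "listening"),
        (50, 70, rois, "listening"),
        (50, 30, rois, "speaking"),
        (5, 5, rois, "speaking"),
    ]
    print(gaze_proportions(demo))
```

Under this labelling, total gaze to the face is the sum of the `eyes`, `mouth`, and `face_other` proportions; a real analysis would also need to handle frames where the face is not detected and samples with missing gaze data.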
Publication type
Proceedings paper
Publication date
2026
