Esam Ghaleb

I am an AI researcher with a background in computer science and engineering. I develop computational models of multimodal language and behaviour: how speech, text, gesture, sign, facial expressions, eye gaze, and whole-body movement jointly encode meaning in interaction. Previously, I worked at the University of Amsterdam and Maastricht University on linguistic–gestural alignment and explainable multimodal modelling of human behaviour.

I am currently a research staff member in the Multimodal Language Department at the Max Planck Institute for Psycholinguistics, and I lead the department’s Multimodal Modelling Cluster. My work develops machine-learning methods for the segmentation, coding, and representation of visual communicative signals from motion-capture and video data, and uses learned representations as testbeds for theories of multimodal language across languages and interactional settings. I also study how large language models and multimodal language models integrate and generate multimodal behaviour, and I develop generative models of gesture for virtual agents and avatars. Recent work includes an NWO XS-funded project, Grounded Gesture Generation in Context: Object- and Interaction-Aware Generative AI Models of Language Use.

For more information and an up-to-date list of publications, please visit my personal website: https://esamghaleb.github.io/