Displaying 1 - 19 of 19
-
Eijk, L., Rasenberg, M., Arnese, F., Blokpoel, M., Dingemanse, M., Doeller, C. F., Ernestus, M., Holler, J., Milivojevic, B., Özyürek, A., Pouw, W., Van Rooij, I., Schriefers, H., Toni, I., Trujillo, J. P., & Bögels, S. (2022). The CABB dataset: A multimodal corpus of communicative interactions for behavioural and neural analyses. NeuroImage, 264: 119734. doi:10.1016/j.neuroimage.2022.119734.
Abstract
We present a dataset of behavioural and fMRI observations acquired in the context of humans involved in multimodal referential communication. The dataset contains audio/video and motion-tracking recordings of face-to-face, task-based communicative interactions in Dutch, as well as behavioural and neural correlates of participants’ representations of dialogue referents. Seventy-one pairs of unacquainted participants performed two interleaved interactional tasks in which they described and located 16 novel geometrical objects (i.e., Fribbles) yielding spontaneous interactions of about one hour. We share high-quality video (from three cameras), audio (from head-mounted microphones), and motion-tracking (Kinect) data, as well as speech transcripts of the interactions. Before and after engaging in the face-to-face communicative interactions, participants’ individual representations of the 16 Fribbles were estimated. Behaviourally, participants provided a written description (one to three words) for each Fribble and positioned them along 29 independent conceptual dimensions (e.g., rounded, human, audible). Neurally, fMRI signal evoked by each Fribble was measured during a one-back working-memory task. To enable functional hyperalignment across participants, the dataset also includes fMRI measurements obtained during visual presentation of eight animated movies (35 minutes total). We present analyses for the various types of data demonstrating their quality and consistency with earlier research. Besides high-resolution multimodal interactional data, this dataset includes different correlates of communicative referents, obtained before and after face-to-face dialogue, allowing for novel investigations into the relation between communicative behaviours and the representational space shared by communicators. This unique combination of data can be used for research in neuroscience, psychology, linguistics, and beyond. -
Owoyele, B., Trujillo, J. P., De Melo, G., & Pouw, W. (2022). Masked-Piper: Masking personal identities in visual recordings while preserving multimodal information. SoftwareX, 20: 101236. doi:10.1016/j.softx.2022.101236.
Abstract
In this increasingly data-rich world, visual recordings of human behavior are often unable to be shared due to concerns about privacy. Consequently, data sharing in fields such as behavioral science, multimodal communication, and human movement research is often limited. In addition, in legal and other non-scientific contexts, privacy-related concerns may preclude the sharing of video recordings and thus remove the rich multimodal context that humans recruit to communicate. Minimizing the risk of identity exposure while preserving critical behavioral information would maximize utility of public resources (e.g., research grants) and time invested in audio–visual research. Here we present an open-source computer vision tool that masks the identities of humans while maintaining rich information about communicative body movements. Furthermore, this masking tool can be easily applied to many videos, leveraging computational tools to augment the reproducibility and accessibility of behavioral research. The tool is designed for researchers and practitioners engaged in kinematic and affective research. Application areas include teaching/education, communication and human movement research, CCTV, and legal contexts.Additional information
setup and usage -
Pearson, L., & Pouw, W. (2022). Gesture–vocal coupling in Karnatak music performance: A neuro–bodily distributed aesthetic entanglement. Annals of the New York Academy of Sciences, 1515(1), 219-236. doi:10.1111/nyas.14806.
Abstract
In many musical styles, vocalists manually gesture while they sing. Coupling between gesture kinematics and vocalization has been examined in speech contexts, but it is an open question how these couple in music making. We examine this in a corpus of South Indian, Karnatak vocal music that includes motion-capture data. Through peak magnitude analysis (linear mixed regression) and continuous time-series analyses (generalized additive modeling), we assessed whether vocal trajectories around peaks in vertical velocity, speed, or acceleration were coupling with changes in vocal acoustics (namely, F0 and amplitude). Kinematic coupling was stronger for F0 change versus amplitude, pointing to F0's musical significance. Acceleration was the most predictive for F0 change and had the most reliable magnitude coupling, showing a one-third power relation. That acceleration, rather than other kinematics, is maximally predictive for vocalization is interesting because acceleration entails force transfers onto the body. As a theoretical contribution, we argue that gesturing in musical contexts should be understood in relation to the physical connections between gesturing and vocal production that are brought into harmony with the vocalists’ (enculturated) performance goals. Gesture–vocal coupling should, therefore, be viewed as a neuro–bodily distributed aesthetic entanglement.Additional information
tables -
Pouw, W., & Holler, J. (2022). Timing in conversation is dynamically adjusted turn by turn in dyadic telephone conversations. Cognition, 222: 105015. doi:10.1016/j.cognition.2022.105015.
Abstract
Conversational turn taking in humans involves incredibly rapid responding. The timing mechanisms underpinning such responses have been heavily debated, including questions such as who is doing the timing. Similar to findings on rhythmic tapping to a metronome, we show that floor transfer offsets (FTOs) in telephone conversations are serially dependent, such that FTOs are lag-1 negatively autocorrelated. Finding this serial dependence on a turn-by-turn basis (lag-1) rather than on the basis of two or more turns, suggests a counter-adjustment mechanism operating at the level of the dyad in FTOs during telephone conversations, rather than a more individualistic self-adjustment within speakers. This finding, if replicated, has major implications for models describing turn taking, and confirms the joint, dyadic nature of human conversational dynamics. Future research is needed to see how pervasive serial dependencies in FTOs are, such as for example in richer communicative face-to-face contexts where visual signals affect conversational timing. -
Pouw, W., & Dixon, J. A. (2022). What you hear and see specifies the perception of a limb-respiratory-vocal act. Proceedings of the Royal Society B: Biological Sciences, 289(1979): 20221026. doi:10.1098/rspb.2022.1026.
-
Pouw, W., Harrison, S. J., & Dixon, J. A. (2022). The importance of visual control and biomechanics in the regulation of gesture-speech synchrony for an individual deprived of proprioceptive feedback of body position. Scientific Reports, 12: 14775. doi:10.1038/s41598-022-18300-x.
Abstract
Do communicative actions such as gestures fundamentally differ in their control mechanisms from other actions? Evidence for such fundamental differences comes from a classic gesture-speech coordination experiment performed with a person (IW) with deafferentation (McNeill, 2005). Although IW has lost both his primary source of information about body position (i.e., proprioception) and discriminative touch from the neck down, his gesture-speech coordination has been reported to be largely unaffected, even if his vision is blocked. This is surprising because, without vision, his object-directed actions almost completely break down. We examine the hypothesis that IW’s gesture-speech coordination is supported by the biomechanical effects of gesturing on head posture and speech. We find that when vision is blocked, there are micro-scale increases in gesture-speech timing variability, consistent with IW’s reported experience that gesturing is difficult without vision. Supporting the hypothesis that IW exploits biomechanical consequences of the act of gesturing, we find that: (1) gestures with larger physical impulses co-occur with greater head movement, (2) gesture-speech synchrony relates to larger gesture-concurrent head movements (i.e. for bimanual gestures), (3) when vision is blocked, gestures generate more physical impulse, and (4) moments of acoustic prominence couple more with peaks of physical impulse when vision is blocked. It can be concluded that IW’s gesturing ability is not based on a specialized language-based feedforward control as originally concluded from previous research, but is still dependent on a varied means of recurrent feedback from the body.Additional information
supplementary tables -
Pouw, W., & Fuchs, S. (2022). Origins of vocal-entangled gesture. Neuroscience and Biobehavioral Reviews, 141: 104836. doi:10.1016/j.neubiorev.2022.104836.
Abstract
Gestures during speaking are typically understood in a representational framework: they represent absent or distal states of affairs by means of pointing, resemblance, or symbolic replacement. However, humans also gesture along with the rhythm of speaking, which is amenable to a non-representational perspective. Such a perspective centers on the phenomenon of vocal-entangled gestures and builds on evidence showing that when an upper limb with a certain mass decelerates/accelerates sufficiently, it yields impulses on the body that cascade in various ways into the respiratory–vocal system. It entails a physical entanglement between body motions, respiration, and vocal activities. It is shown that vocal-entangled gestures are realized in infant vocal–motor babbling before any representational use of gesture develops. Similarly, an overview is given of vocal-entangled processes in non-human animals. They can frequently be found in rats, bats, birds, and a range of other species that developed even earlier in the phylogenetic tree. Thus, the origins of human gesture lie in biomechanics, emerging early in ontogeny and running deep in phylogeny. -
Rasenberg, M., Pouw, W., Özyürek, A., & Dingemanse, M. (2022). The multimodal nature of communicative efficiency in social interaction. Scientific Reports, 12: 19111. doi:10.1038/s41598-022-22883-w.
Abstract
How does communicative efficiency shape language use? We approach this question by studying it at the level of the dyad, and in terms of multimodal utterances. We investigate whether and how people minimize their joint speech and gesture efforts in face-to-face interactions, using linguistic and kinematic analyses. We zoom in on other-initiated repair—a conversational microcosm where people coordinate their utterances to solve problems with perceiving or understanding. We find that efforts in the spoken and gestural modalities are wielded in parallel across repair turns of different types, and that people repair conversational problems in the most cost-efficient way possible, minimizing the joint multimodal effort for the dyad as a whole. These results are in line with the principle of least collaborative effort in speech and with the reduction of joint costs in non-linguistic joint actions. The results extend our understanding of those coefficiency principles by revealing that they pertain to multimodal utterance design.Additional information
Data and analysis scripts -
Brown, A. R., Pouw, W., Brentari, D., & Goldin-Meadow, S. (2021). People are less susceptible to illusion when they use their hands to communicate rather than estimate. Psychological Science, 32, 1227-1237. doi:10.1177/0956797621991552.
Abstract
When we use our hands to estimate the length of a stick in the Müller-Lyer illusion, we are highly susceptible to the illusion. But when we prepare to act on sticks under the same conditions, we are significantly less susceptible. Here, we asked whether people are susceptible to illusion when they use their hands not to act on objects but to describe them in spontaneous co-speech gestures or conventional sign languages of the deaf. Thirty-two English speakers and 13 American Sign Language signers used their hands to act on, estimate the length of, and describe sticks eliciting the Müller-Lyer illusion. For both gesture and sign, the magnitude of illusion in the description task was smaller than the magnitude of illusion in the estimation task and not different from the magnitude of illusion in the action task. The mechanisms responsible for producing gesture in speech and sign thus appear to operate not on percepts involved in estimation but on percepts derived from the way we act on objects. -
Pouw, W., Dingemanse, M., Motamedi, Y., & Ozyurek, A. (2021). A systematic investigation of gesture kinematics in evolving manual languages in the lab. Cognitive Science, 45(7): e13014. doi:10.1111/cogs.13014.
Abstract
Silent gestures consist of complex multi-articulatory movements but are now primarily studied through categorical coding of the referential gesture content. The relation of categorical linguistic content with continuous kinematics is therefore poorly understood. Here, we reanalyzed the video data from a gestural evolution experiment (Motamedi, Schouwstra, Smith, Culbertson, & Kirby, 2019), which showed increases in the systematicity of gesture content over time. We applied computer vision techniques to quantify the kinematics of the original data. Our kinematic analyses demonstrated that gestures become more efficient and less complex in their kinematics over generations of learners. We further detect the systematicity of gesture form on the level of thegesture kinematic interrelations, which directly scales with the systematicity obtained on semantic coding of the gestures. Thus, from continuous kinematics alone, we can tap into linguistic aspects that were previously only approachable through categorical coding of meaning. Finally, going beyond issues of systematicity, we show how unique gesture kinematic dialects emerged over generations as isolated chains of participants gradually diverged over iterations from other chains. We, thereby, conclude that gestures can come to embody the linguistic system at the level of interrelationships between communicative tokens, which should calibrate our theories about form and linguistic content. -
Pouw, W., Wit, J., Bögels, S., Rasenberg, M., Milivojevic, B., & Ozyurek, A. (2021). Semantically related gestures move alike: Towards a distributional semantics of gesture kinematics. In V. G. Duffy (
Ed. ), Digital human modeling and applications in health, safety, ergonomics and risk management. human body, motion and behavior:12th International Conference, DHM 2021, Held as Part of the 23rd HCI International Conference, HCII 2021 (pp. 269-287). Berlin: Springer. doi:10.1007/978-3-030-77817-0_20. -
Pouw, W., Proksch, S., Drijvers, L., Gamba, M., Holler, J., Kello, C., Schaefer, R. S., & Wiggins, G. A. (2021). Multilevel rhythms in multimodal communication. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 376: 20200334. doi:10.1098/rstb.2020.0334.
Abstract
It is now widely accepted that the brunt of animal communication is conducted via several modalities, e.g. acoustic and visual, either simultaneously or sequentially. This is a laudable multimodal turn relative to traditional accounts of temporal aspects of animal communication which have focused on a single modality at a time. However, the fields that are currently contributing to the study of multimodal communication are highly varied, and still largely disconnected given their sole focus on a particular level of description or their particular concern with human or non-human animals. Here, we provide an integrative overview of converging findings that show how multimodal processes occurring at neural, bodily, as well as social interactional levels each contribute uniquely to the complex rhythms that characterize communication in human and non-human animals. Though we address findings for each of these levels independently, we conclude that the most important challenge in this field is to identify how processes at these different levels connect. -
Pouw, W., De Jonge-Hoekstra, L., Harrison, S. J., Paxton, A., & Dixon, J. A. (2021). Gesture-speech physics in fluent speech and rhythmic upper limb movements. Annals of the New York Academy of Sciences, 1491(1), 89-105. doi:10.1111/nyas.14532.
Abstract
Communicative hand gestures are often coordinated with prosodic aspects of speech, and salient moments of gestural movement (e.g., quick changes in speed) often co-occur with salient moments in speech (e.g., near peaks in fundamental frequency and intensity). A common understanding is that such gesture and speech coordination is culturally and cognitively acquired, rather than having a biological basis. Recently, however, the biomechanical physical coupling of arm movements to speech movements has been identified as a potentially important factor in understanding the emergence of gesture-speech coordination. Specifically, in the case of steady-state vocalization and mono-syllable utterances, forces produced during gesturing are transferred onto the tensioned body, leading to changes in respiratory-related activity and thereby affecting vocalization F0 and intensity. In the current experiment (N = 37), we extend this previous line of work to show that gesture-speech physics impacts fluent speech, too. Compared with non-movement, participants who are producing fluent self-formulated speech, while rhythmically moving their limbs, demonstrate heightened F0 and amplitude envelope, and such effects are more pronounced for higher-impulse arm versus lower-impulse wrist movement. We replicate that acoustic peaks arise especially during moments of peak-impulse (i.e., the beat) of the movement, namely around deceleration phases of the movement. Finally, higher deceleration rates of higher-mass arm movements were related to higher peaks in acoustics. These results confirm a role for physical-impulses of gesture affecting the speech system. We discuss the implications of
gesture-speech physics for understanding of the emergence of communicative gesture, both ontogenetically and phylogenetically.Additional information
data and analyses -
Kamermans, K. L., Pouw, W., Mast, F. W., & Paas, F. (2019). Reinterpretation in visual imagery is possible without visual cues: A validation of previous research. Psychological Research, 83(6), 1237-1250. doi:10.1007/s00426-017-0956-5.
Abstract
Is visual reinterpretation of bistable figures (e.g., duck/rabbit figure) in visual imagery possible? Current consensus suggests that it is in principle possible because of converging evidence of quasi-pictorial functioning of visual imagery. Yet, studies that have directly tested and found evidence for reinterpretation in visual imagery, allow for the possibility that reinterpretation was already achieved during memorization of the figure(s). One study resolved this issue, providing evidence for reinterpretation in visual imagery (Mast and Kosslyn, Cognition 86:57-70, 2002). However, participants in that study performed reinterpretations with aid of visual cues. Hence, reinterpretation was not performed with mental imagery alone. Therefore, in this study we assessed the possibility of reinterpretation without visual support. We further explored the possible role of haptic cues to assess the multimodal nature of mental imagery. Fifty-three participants were consecutively presented three to be remembered bistable 2-D figures (reinterpretable when rotated 180 degrees), two of which were visually inspected and one was explored hapticly. After memorization of the figures, a visually bistable exemplar figure was presented to ensure understanding of the concept of visual bistability. During recall, 11 participants (out of 36; 30.6%) who did not spot bistability during memorization successfully performed reinterpretations when instructed to mentally rotate their visual image, but additional haptic cues during mental imagery did not inflate reinterpretation ability. This study validates previous findings that reinterpretation in visual imagery is possible. -
Kamermans, K. L., Pouw, W., Fassi, L., Aslanidou, A., Paas, F., & Hostetter, A. B. (2019). The role of gesture as simulated action in reinterpretation of mental imagery. Acta Psychologica, 197, 131-142. doi:10.1016/j.actpsy.2019.05.004.
Abstract
In two experiments, we examined the role of gesture in reinterpreting a mental image. In Experiment 1, we found that participants gestured more about a figure they had learned through manual exploration than about a figure they had learned through vision. This supports claims that gestures emerge from the activation of perception-relevant actions during mental imagery. In Experiment 2, we investigated whether such gestures have a causal role in affecting the quality of mental imagery. Participants were randomly assigned to gesture, not gesture, or engage in a manual interference task as they attempted to reinterpret a figure they had learned through manual exploration. We found that manual interference significantly impaired participants' success on the task. Taken together, these results suggest that gestures reflect mental imaginings of interactions with a mental image and that these imaginings are critically important for mental manipulation and reinterpretation of that image. However, our results suggest that enacting the imagined movements in gesture is not critically important on this particular task. -
Pouw, W., Paxton, A., Harrison, S. J., & Dixon, J. A. (2019). Acoustic specification of upper limb movement in voicing. In A. Grimminger (
Ed. ), Proceedings of the 6th Gesture and Speech in Interaction – GESPIN 6 (pp. 68-74). Paderborn: Universitaetsbibliothek Paderborn. doi:10.17619/UNIPB/1-812.Additional information
https://osf.io/9843h/ -
Pouw, W., & Dixon, J. A. (2019). Entrainment and modulation of gesture-speech synchrony under delayed auditory feedback. Cognitive Science, 43(3): e12721. doi:10.1111/cogs.12721.
Abstract
Gesture–speech synchrony re-stabilizes when hand movement or speech is disrupted by a delayed
feedback manipulation, suggesting strong bidirectional coupling between gesture and speech. Yet it
has also been argued from case studies in perceptual–motor pathology that hand gestures are a special
kind of action that does not require closed-loop re-afferent feedback to maintain synchrony with
speech. In the current pre-registered within-subject study, we used motion tracking to conceptually
replicate McNeill’s (1992) classic study on gesture–speech synchrony under normal and 150 ms
delayed auditory feedback of speech conditions (NO DAF vs. DAF). Consistent with, and extending
McNeill’s original results, we obtain evidence that (a) gesture-speech synchrony is more stable
under DAF versus NO DAF (i.e., increased coupling effect), (b) that gesture and speech variably
entrain to the external auditory delay as indicated by a consistent shift in gesture-speech synchrony
offsets (i.e., entrainment effect), and (c) that the coupling effect and the entrainment effect are codependent.
We suggest, therefore, that gesture–speech synchrony provides a way for the cognitive
system to stabilize rhythmic activity under interfering conditions.Additional information
https://osf.io/pcde3/ -
Pouw, W., & Dixon, J. A. (2019). Quantifying gesture-speech synchrony. In A. Grimminger (
Ed. ), Proceedings of the 6th Gesture and Speech in Interaction – GESPIN 6 (pp. 75-80). Paderborn: Universitaetsbibliothek Paderborn. doi:10.17619/UNIPB/1-812.Abstract
Spontaneously occurring speech is often seamlessly accompanied by hand gestures. Detailed
observations of video data suggest that speech and gesture are tightly synchronized in time,
consistent with a dynamic interplay between body and mind. However, spontaneous gesturespeech
synchrony has rarely been objectively quantified beyond analyses of video data, which
do not allow for identification of kinematic properties of gestures. Consequently, the point in
gesture which is held to couple with speech, the so-called moment of “maximum effort”, has
been variably equated with the peak velocity, peak acceleration, peak deceleration, or the onset
of the gesture. In the current exploratory report, we provide novel evidence from motiontracking
and acoustic data that peak velocity is closely aligned, and shortly leads, the peak pitch
(F0) of speechAdditional information
https://osf.io/9843h/ -
Pouw, W., Rop, G., De Koning, B., & Paas, F. (2019). The cognitive basis for the split-attention effect. Journal of Experimental Psychology: General, 148(11), 2058-2075. doi:10.1037/xge0000578.
Abstract
The split-attention effect entails that learning from spatially separated, but mutually referring information
sources (e.g., text and picture), is less effective than learning from the equivalent spatially integrated
sources. According to cognitive load theory, impaired learning is caused by the working memory load
imposed by the need to distribute attention between the information sources and mentally integrate them.
In this study, we directly tested whether the split-attention effect is caused by spatial separation per se.
Spatial distance was varied in basic cognitive tasks involving pictures (Experiment 1) and text–picture
combinations (Experiment 2; preregistered study), and in more ecologically valid learning materials
(Experiment 3). Experiment 1 showed that having to integrate two pictorial stimuli at greater distances
diminished performance on a secondary visual working memory task, but did not lead to slower
integration. When participants had to integrate a picture and written text in Experiment 2, a greater
distance led to slower integration of the stimuli, but not to diminished performance on the secondary task.
Experiment 3 showed that presenting spatially separated (compared with integrated) textual and pictorial
information yielded fewer integrative eye movements, but this was not further exacerbated when
increasing spatial distance even further. This effect on learning processes did not lead to differences in
learning outcomes between conditions. In conclusion, we provide evidence that larger distances between
spatially separated information sources influence learning processes, but that spatial separation on its
own is not likely to be the only, nor a sufficient, condition for impacting learning outcomes.Files private
Request files
Share this page