-
Bosker, H. R., Peeters, D., & Holler, J. (2020). How visual cues to speech rate influence speech perception. Quarterly Journal of Experimental Psychology, 73(10), 1523-1536. doi:10.1177/1747021820914564.
Abstract
Spoken words are highly variable and therefore listeners interpret speech sounds relative to the surrounding acoustic context, such as the speech rate of a preceding sentence. For instance, a vowel midway between short /ɑ/ and long /a:/ in Dutch is perceived as short /ɑ/ in the context of preceding slow speech, but as long /a:/ if preceded by a fast context. Despite the well-established influence of visual articulatory cues on speech comprehension, it remains unclear whether visual cues to speech rate also influence subsequent spoken word recognition. In two ‘Go Fish’-like experiments, participants were presented with audio-only (auditory speech + fixation cross), visual-only (mute videos of talking head), and audiovisual (speech + videos) context sentences, followed by ambiguous target words containing vowels midway between short /ɑ/ and long /a:/. In Experiment 1, target words were always presented auditorily, without visual articulatory cues. Although the audio-only and audiovisual contexts induced a rate effect (i.e., more long /a:/ responses after fast contexts), the visual-only condition did not. When, in Experiment 2, target words were presented audiovisually, rate effects were observed in all three conditions, including visual-only. This suggests that visual cues to speech rate in a context sentence influence the perception of following visual target cues (e.g., duration of lip aperture), which at an audiovisual integration stage bias participants’ target categorization responses. These findings contribute to a better understanding of how what we see influences what we hear. -
Macuch Silva, V., Holler, J., Ozyurek, A., & Roberts, S. G. (2020). Multimodality and the origin of a novel communication system in face-to-face interaction. Royal Society Open Science, 7: 182056. doi:10.1098/rsos.182056.
Abstract
Face-to-face communication is multimodal at its core: it consists of a combination of vocal and visual signalling. However, current evidence suggests that, in the absence of an established communication system, visual signalling, especially in the form of visible gesture, is a more powerful form of communication than vocalisation, and therefore likely to have played a primary role in the emergence of human language. This argument is based on experimental evidence of how vocal and visual modalities (i.e., gesture) are employed to communicate about familiar concepts when participants cannot use their existing languages. To investigate this further, we introduce an experiment where pairs of participants performed a referential communication task in which they described unfamiliar stimuli in order to reduce reliance on conventional signals. Visual and auditory stimuli were described in three conditions: using visible gestures only, using non-linguistic vocalisations only and given the option to use both (multimodal communication). The results suggest that even in the absence of conventional signals, gesture is a more powerful mode of communication compared to vocalisation, but that there are also advantages to multimodality compared to using gesture alone. Participants with an option to produce multimodal signals had comparable accuracy to those using only gesture, but gained an efficiency advantage. The analysis of the interactions between participants showed that interactants developed novel communication systems for unfamiliar stimuli by deploying different modalities flexibly to suit their needs and by taking advantage of multimodality when required. -
Ripperda, J., Drijvers, L., & Holler, J. (2020). Speeding up the detection of non-iconic and iconic gestures (SPUDNIG): A toolkit for the automatic detection of hand movements and gestures in video data. Behavior Research Methods, 52(4), 1783-1794. doi:10.3758/s13428-020-01350-2.
Abstract
In human face-to-face communication, speech is frequently accompanied by visual signals, especially communicative hand gestures. Analyzing these visual signals requires detailed manual annotation of video data, which is often a labor-intensive and time-consuming process. To facilitate this process, we here present SPUDNIG (SPeeding Up the Detection of Non-iconic and Iconic Gestures), a tool to automate the detection and annotation of hand movements in video data. We provide a detailed description of how SPUDNIG detects hand movement initiation and termination, as well as open-source code and a short tutorial on an easy-to-use graphical user interface (GUI) of our tool. We then provide a proof-of-principle and validation of our method by comparing SPUDNIG’s output to manual annotations of gestures by a human coder. While the tool does not entirely eliminate the need for a human coder (e.g., for false-positive detection), our results demonstrate that SPUDNIG can detect both iconic and non-iconic gestures with very high accuracy, and could successfully detect all iconic gestures in our validation dataset. Importantly, SPUDNIG’s output can directly be imported into commonly used annotation tools such as ELAN and ANVIL. We therefore believe that SPUDNIG will be highly relevant for researchers studying multimodal communication because its annotations significantly accelerate the analysis of large video corpora. -
Sekine, K., Schoechl, C., Mulder, K., Holler, J., Kelly, S., Furman, R., & Ozyurek, A. (2020). Evidence for children's online integration of simultaneous information from speech and iconic gestures: An ERP study. Language, Cognition and Neuroscience, 35(10), 1283-1294. doi:10.1080/23273798.2020.1737719.
Abstract
Children perceive iconic gestures along with the speech they hear. Previous studies have shown that children integrate information from both modalities. Yet it is not known whether children can integrate both types of information simultaneously as soon as they are available, as adults do, or whether they process them separately at first and integrate them later. Using electrophysiological measures, we examined the online neurocognitive processing of gesture-speech integration in 6- to 7-year-old children. We focused on the N400 event-related potential component, which is modulated by semantic integration load. Children watched video clips of matching or mismatching gesture-speech combinations, which varied the semantic integration load. The ERPs showed that the amplitude of the N400 was larger in the mismatching condition than in the matching condition. This finding provides the first neural evidence that by the ages of 6 or 7, children integrate multimodal semantic information in an online fashion comparable to that of adults. -
Ter Bekke, M., Drijvers, L., & Holler, J. (2020). The predictive potential of hand gestures during conversation: An investigation of the timing of gestures in relation to speech. In Proceedings of the 7th GESPIN - Gesture and Speech in Interaction Conference. Stockholm: KTH Royal Institute of Technology.
Abstract
In face-to-face conversation, recipients might use the bodily movements of the speaker (e.g. gestures) to facilitate language processing. It has been suggested that one way through which this facilitation may happen is prediction. However, for this to be possible, gestures would need to precede speech, and it is unclear whether this is true during natural conversation.
In a corpus of Dutch conversations, we annotated hand gestures that represent semantic information and occurred during questions, and the word(s) which corresponded most closely to the gesturally depicted meaning. Thus, we tested whether representational gestures temporally precede their lexical affiliates. Further, to see whether preceding gestures may indeed facilitate language processing, we asked whether the gesture-speech asynchrony predicts the response time to the question the gesture is part of.
Gestures and their strokes (most meaningful movement component) indeed preceded the corresponding lexical information, thus demonstrating their predictive potential. However, while questions with gestures got faster responses than questions without, there was no evidence that questions with larger gesture-speech asynchronies get faster responses. These results suggest that gestures indeed have the potential to facilitate predictive language processing, but further analyses on larger datasets are needed to test for links between asynchrony and processing advantages. -
Holler, J., Kendrick, K. H., Casillas, M., & Levinson, S. C. (Eds.). (2016). Turn-Taking in Human Communicative Interaction. Lausanne: Frontiers Media. doi:10.3389/978-2-88919-825-2.
Abstract
The core use of language is in face-to-face conversation. This is characterized by rapid turn-taking. This turn-taking poses a number of central puzzles for the psychology of language.
Consider, for example, that in large corpora the gap between turns is on the order of 100 to 300 ms, but the latencies involved in language production require a minimum of 600 ms (for a single word) to 1500 ms (for a simple sentence). This implies that participants in conversation are predicting the ends of the incoming turn and preparing in advance. But how is this done? What aspects of this prediction are done when? What happens when the prediction is wrong? What stops participants coming in too early? If the system is running on prediction, why is there consistently a mode of 100 to 300 ms in response time?
The timing puzzle raises further puzzles: it seems that comprehension must run parallel with the preparation for production, but it has been presumed that there are strict cognitive limitations on more than one central process running at a time. How is this bottleneck overcome? Far from being 'easy' as some psychologists have suggested, conversation may be one of the most demanding cognitive tasks in our everyday lives. Further questions naturally arise: how do children learn to master this demanding task, and what is the developmental trajectory in this domain?
Research shows that aspects of turn-taking such as its timing are remarkably stable across languages and cultures, but the word order of languages varies enormously. How then does prediction of the incoming turn work when the verb (often the informational nugget in a clause) is at the end? Conversely, how can production work fast enough in languages that have the verb at the beginning, thereby requiring early planning of the whole clause? What happens when one changes modality, as in sign languages -- with the loss of channel constraints is turn-taking much freer? And what about face-to-face communication amongst hearing individuals -- do gestures, gaze, and other body behaviors facilitate turn-taking? One can also ask the phylogenetic question: how did such a system evolve? There seem to be parallels (analogies) in duetting bird species, and in a variety of monkey species, but there is little evidence of anything like this among the great apes.
All this constitutes a neglected set of problems at the heart of the psychology of language and of the language sciences. This research topic welcomes contributions from right across the board, for example from psycholinguists, developmental psychologists, students of dialogue and conversation analysis, linguists interested in the use of language, phoneticians, corpus analysts and comparative ethologists or psychologists. We welcome contributions of all sorts, for example original research papers, opinion pieces, and reviews of work in subfields that may not be fully understood in other subfields. -
Humphries, S., Holler, J., Crawford, T. J., Herrera, E., & Poliakoff, E. (2016). A third-person perspective on co-speech action gestures in Parkinson’s disease. Cortex, 78, 44-54. doi:10.1016/j.cortex.2016.02.009.
Abstract
A combination of impaired motor and cognitive function in Parkinson’s disease (PD) can impact on language and communication, with patients exhibiting a particular difficulty processing action verbs. Co-speech gestures embody a link between action and language and contribute significantly to communication in healthy people. Here, we investigated how co-speech gestures depicting actions are affected in PD, in particular with respect to the visual perspective, or viewpoint, they depict. Gestures are closely related to mental imagery and motor simulations, but people with PD may be impaired in the way they simulate actions from a first-person perspective and may compensate for this by relying more on third-person visual features. We analysed the action-depicting gestures produced by mild-moderate PD patients and age-matched controls on an action description task and examined the relationship between gesture viewpoint, action naming, and performance on an action observation task (weight judgement). Healthy controls produced the majority of their action gestures from a first-person perspective, whereas PD patients produced a greater proportion of gestures from a third-person perspective. We propose that this reflects a compensatory reliance on third-person visual features in the simulation of actions in PD. Performance was also impaired in action naming and weight judgement, although this was unrelated to gesture viewpoint. Our findings provide a more comprehensive understanding of how action-language impairments in PD impact on action communication and of the cognitive underpinnings of this impairment, and elucidate the role of action simulation in gesture production. -
Rowbotham, S. J., Holler, J., Wearden, A., & Lloyd, D. M. (2016). I see how you feel: Recipients obtain additional information from speakers’ gestures about pain. Patient Education and Counseling, 99(8), 1333-1342. doi:10.1016/j.pec.2016.03.007.
Abstract
Objective
Despite the need for effective pain communication, pain is difficult to verbalise. Co-speech gestures frequently add information about pain that is not contained in the accompanying speech. We explored whether recipients can obtain additional information from gestures about the pain that is being described.
Methods
Participants (n = 135) viewed clips of pain descriptions under one of four conditions: 1) Speech Only; 2) Speech and Gesture; 3) Speech, Gesture and Face; and 4) Speech, Gesture and Face plus Instruction (short presentation explaining the pain information that gestures can depict). Participants provided free-text descriptions of the pain that had been described. Responses were scored for the amount of information obtained from the original clips.
Findings
Participants in the Instruction condition obtained the most information, while those in the Speech Only condition obtained the least (all comparisons p<.001).
Conclusions
Gestures produced during pain descriptions provide additional information about pain that recipients are able to pick up without detriment to their uptake of spoken information.
Practice implications
Healthcare professionals may benefit from instruction in gestures to enhance uptake of information about patients’ pain experiences. -
Cai, Z. G., Connell, L., & Holler, J. (2013). Time does not flow without language: Spatial distance affects temporal duration regardless of movement or direction. Psychonomic Bulletin & Review, 20(5), 973-980. doi:10.3758/s13423-013-0414-3.
Abstract
Much evidence has suggested that people conceive of time as flowing directionally in transverse space (e.g., from left to right for English speakers). However, this phenomenon has never been tested in a fully nonlinguistic paradigm where neither stimuli nor task use linguistic labels, which raises the possibility that time is directional only when reading/writing direction has been evoked. In the present study, English-speaking participants viewed a video where an actor sang a note while gesturing and reproduced the duration of the sung note by pressing a button. Results showed that the perceived duration of the note was increased by a long-distance gesture, relative to a short-distance gesture. This effect was equally strong for gestures moving from left to right and from right to left and was not dependent on gestures depicting movement through space; a weaker version of the effect emerged with static gestures depicting spatial distance. Since both our gesture stimuli and temporal reproduction task were nonlinguistic, we conclude that the spatial representation of time is nondirectional: Movement contributes, but is not necessary, to the representation of temporal information in a transverse timeline. -
Connell, L., Cai, Z. G., & Holler, J. (2013). Do you see what I'm singing? Visuospatial movement biases pitch perception. Brain and Cognition, 81, 124-130. doi:10.1016/j.bandc.2012.09.005.
Abstract
The nature of the connection between musical and spatial processing is controversial. While pitch may be described in spatial terms such as “high” or “low”, it is unclear whether pitch and space are associated but separate dimensions or whether they share representational and processing resources. In the present study, we asked participants to judge whether a target vocal note was the same as (or different from) a preceding cue note. Importantly, target trials were presented as video clips where a singer sometimes gestured upward or downward while singing that target note, thus providing an alternative, concurrent source of spatial information. Our results show that pitch discrimination was significantly biased by the spatial movement in gesture, such that downward gestures made notes seem lower in pitch than they really were, and upward gestures made notes seem higher in pitch. These effects were eliminated by spatial memory load but preserved under verbal memory load conditions. Together, our findings suggest that pitch and space have a shared representation such that the mental representation of pitch is audiospatial in nature. -
Hall, S., Rumney, L., Holler, J., & Kidd, E. (2013). Associations among play, gesture and early spoken language acquisition. First Language, 33, 294-312. doi:10.1177/0142723713487618.
Abstract
The present study investigated the developmental interrelationships between play, gesture use and spoken language development in children aged 18–31 months. The children completed two tasks: (i) a structured measure of pretend (or ‘symbolic’) play and (ii) a measure of vocabulary knowledge in which children have been shown to gesture. Additionally, their productive spoken language knowledge was measured via parental report. The results indicated that symbolic play is positively associated with children’s gesture use, which in turn is positively associated with spoken language knowledge over and above the influence of age. The tripartite relationship between gesture, play and language development is discussed with reference to current developmental theory. -
Holler, J., Schubotz, L., Kelly, S., Schuetze, M., Hagoort, P., & Ozyurek, A. (2013). Here's not looking at you, kid! Unaddressed recipients benefit from co-speech gestures when speech processing suffers. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 2560-2565). Austin, TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0463/index.html.
Abstract
In human face-to-face communication, language comprehension is a multi-modal, situated activity. However, little is known about how we combine information from these different modalities, and how perceived communicative intentions, often signaled through visual signals, such as eye gaze, may influence this processing. We address this question by simulating a triadic communication context in which a speaker alternated her gaze between two different recipients. Participants thus viewed speech-only or speech+gesture object-related utterances when being addressed (direct gaze) or unaddressed (averted gaze). Two object images followed each message and participants’ task was to choose the object that matched the message. Unaddressed recipients responded significantly slower than addressees for speech-only utterances. However, perceiving the same speech accompanied by gestures sped them up to a level identical to that of addressees. That is, when speech processing suffers due to not being addressed, gesture processing remains intact and enhances the comprehension of a speaker’s message. -
Holler, J., Turner, K., & Varcianna, T. (2013). It's on the tip of my fingers: Co-speech gestures during lexical retrieval in different social contexts. Language and Cognitive Processes, 28(10), 1509-1518. doi:10.1080/01690965.2012.698289.
Abstract
The Lexical Retrieval Hypothesis proposes that gestures function at the level of speech production, aiding in the retrieval of lexical items from the mental lexicon. However, empirical evidence for this account is mixed, and some critics argue that a more likely function of gestures during lexical retrieval is a communicative one. The present study was designed to test these predictions against each other by keeping lexical retrieval difficulty constant while varying social context. Participants' gestures were analysed during tip of the tongue experiences when communicating with a partner face-to-face (FTF), while being separated by a screen, or on their own by speaking into a voice recorder. The results show that participants in the FTF context produced significantly more representational gestures than participants in the solitary condition. This suggests that, even in the specific context of lexical retrieval difficulties, representational gestures appear to play predominantly a communicative role. -
Lynott, D., Connell, L., & Holler, J. (Eds.). (2013). The role of body and environment in cognition. Frontiers in Psychology, 4: 465. doi:10.3389/fpsyg.2013.00465. -
Peeters, D., Chu, M., Holler, J., Ozyurek, A., & Hagoort, P. (2013). Getting to the point: The influence of communicative intent on the kinematics of pointing gestures. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 1127-1132). Austin, TX: Cognitive Science Society.
Abstract
In everyday communication, people not only use speech but also hand gestures to convey information. One intriguing question in gesture research has been why gestures take the specific form they do. Previous research has identified the speaker-gesturer’s communicative intent as one factor shaping the form of iconic gestures. Here we investigate whether communicative intent also shapes the form of pointing gestures. In an experimental setting, twenty-four participants produced pointing gestures identifying a referent for an addressee. The communicative intent of the speaker-gesturer was manipulated by varying the informativeness of the pointing gesture. A second independent variable was the presence or absence of concurrent speech. As a function of their communicative intent and irrespective of the presence of speech, participants varied the durations of the stroke and the post-stroke hold-phase of their gesture. These findings add to our understanding of how the communicative context influences the form that a gesture takes.
Additional information
http://mindmodeling.org/cogsci2013/papers/0219/index.html