Spoken words can make the invisible visible: Testing the involvement of low-level visual representations in spoken word processing

Ostarek, M., & Huettig, F. (2016). Spoken words can make the invisible visible: Testing the involvement of low-level visual representations in spoken word processing. Poster presented at the Eighth Annual Meeting of the Society for the Neurobiology of Language (SNL 2016), London, UK.
The notion that processing spoken (object) words involves activation of category-specific representations in visual cortex is a key prediction of modality-specific theories of representation, in contrast to theories that assume dedicated conceptual representational systems abstracted away from sensorimotor systems. Although some neuroimaging evidence is consistent with such a prediction (Desai et al., 2009; Hwang et al., 2009; Lewis & Poeppel, 2014), these findings do not tell us much about the nature of the representations that were accessed. In the present study, we directly tested whether low-level visual cortex is involved in spoken word processing. Using continuous flash suppression, we show that spoken words activate behaviorally relevant low-level visual representations, and we pin down the time-course of this effect to the first few hundred milliseconds after word onset. We investigated whether participants (N = 24) can detect otherwise invisible objects (presented for 400 ms) when they are presented with the corresponding spoken word 200 ms before the picture appears. We implemented a design in which all cue words appeared equally often in picture-present (50%) and picture-absent (50%) trials. In half of the picture-present trials, the spoken word was congruent with the target picture ("bottle" -> picture of a bottle), while in the other half it was incongruent ("bottle" -> picture of a banana). All picture stimuli were evenly distributed over the experimental conditions to rule out low-level differences that can affect detectability regardless of the prime words. Our results showed facilitated detection for congruent vs. incongruent pictures in terms of hit rates (z = -2.33, p = 0.02) and d'-scores (t = 3.01, p < 0.01). A second experiment (N = 33) investigated the time-course of the effect by manipulating the timing of picture presentation relative to word onset and revealed that it arises as early as 200-400 ms after word onset and decays at around word offset.
Together, these data strongly suggest that spoken words can rapidly activate low-level category-specific visual representations that affect the mere detection of a stimulus, i.e., what we see. More generally, our findings fit best with the notion that spoken words activate modality-specific visual representations that are low-level enough to provide information related to a given token and at the same time abstract enough to be relevant not only for previously seen tokens (a signature of episodic memory) but also for generalizing to novel exemplars one has never seen before.
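
For readers unfamiliar with the d'-scores mentioned in the abstract, the following is a minimal sketch of the standard signal-detection sensitivity measure, d' = z(hit rate) - z(false-alarm rate). The abstract does not state how d' was computed, so everything here is an assumption: the function name `d_prime`, the log-linear correction (adding 0.5 to each count so that rates of 0 or 1 stay finite), and the trial counts in the usage lines, which are invented purely for illustration and are not data from the study.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (+0.5 per cell) is applied so the
    inverse normal CDF is never evaluated at exactly 0 or 1. The abstract
    does not say which correction, if any, the authors used.
    """
    # Hits and misses come from picture-present trials.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    # False alarms and correct rejections come from picture-absent trials.
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


if __name__ == "__main__":
    # Hypothetical counts for one participant (NOT data from the study).
    # Both conditions share the same picture-absent trials in this sketch.
    congruent = d_prime(hits=30, misses=18, false_alarms=8, correct_rejections=88)
    incongruent = d_prime(hits=24, misses=24, false_alarms=8, correct_rejections=88)
    print(f"congruent d' = {congruent:.2f}, incongruent d' = {incongruent:.2f}")
```

With the false-alarm rate held fixed, a higher hit rate gives a larger d', which is the direction of the congruency effect the abstract reports.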