Loke*, J., Seijdel*, N., Snoek, L., Sorensen, L., Van de Klundert, R., Van der Meer, M., Quispel, E., Cappaert, N., & Scholte, H. S. (2024). Human visual cortex and deep convolutional neural network care deeply about object background. Journal of Cognitive Neuroscience, 36(3), 551-566. doi:10.1162/jocn_a_02098.
* These authors contributed equally/shared first author
Abstract
Deep convolutional neural networks (DCNNs) are able to partially predict brain activity during object categorization tasks, but the factors contributing to this predictive power are not fully understood. Our study aimed to identify these factors. We compared the activity of four DCNN architectures with EEG recordings obtained from 62 human participants during an object categorization task. Previous physiological studies on object categorization have highlighted the importance of figure-ground segregation, the ability to distinguish objects from their backgrounds. We therefore investigated whether figure-ground segregation could explain the predictive power of DCNNs. Using a stimulus set consisting of identical target objects embedded in different backgrounds, we examined the influence of object background versus object category within both EEG and DCNN activity. Crucially, recombining naturalistic objects with controlled backgrounds yields a challenging, naturalistic task while retaining experimental control. Our results showed that early EEG activity (< 100 msec) and early DCNN layers represent object background rather than object category. We also found that the ability of DCNNs to predict EEG activity is primarily driven by how both systems process object backgrounds, rather than object categories. By contrasting the activations of trained and untrained (i.e., random-weight) DCNNs, we demonstrated that figure-ground segregation is a potential prerequisite for the recognition of object features. These findings suggest that both human visual cortex and DCNNs prioritize the segregation of object backgrounds and target objects to perform object categorization. Altogether, our study provides new insight into the mechanisms underlying object categorization by demonstrating that both human visual cortex and DCNNs care deeply about object background.
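To make the comparison concrete, below is a minimal, hypothetical sketch of the kind of encoding analysis the abstract describes: regressing time-resolved EEG activity onto DCNN layer activations and scoring cross-validated explained variance at each time point. The shapes, names, and synthetic data are illustrative assumptions, not the authors' pipeline.

```python
# Hypothetical sketch: how well do DCNN layer activations predict EEG
# activity at each time point? Synthetic data stands in for real recordings.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_stimuli, n_features, n_channels, n_times = 120, 256, 64, 100

# Stand-ins for DCNN layer activations (one row per stimulus) and the
# stimulus-averaged EEG response (stimuli x channels x time points).
dcnn_acts = rng.standard_normal((n_stimuli, n_features))
eeg = rng.standard_normal((n_stimuli, n_channels, n_times))

# Cross-validated R^2 of the DCNN-to-EEG mapping at each time point,
# averaged over channels; with real data, early time points (< 100 msec)
# would index early visual processing.
scores = np.empty(n_times)
for t in range(n_times):
    model = RidgeCV(alphas=np.logspace(-2, 4, 7))
    r2 = cross_val_score(model, dcnn_acts, eeg[:, :, t], cv=5, scoring="r2")
    scores[t] = r2.mean()
print(scores[:10])
```

Repeating this with stimuli grouped by background versus by category would show which property drives the shared variance, which is the contrast at the heart of the study.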
Seijdel, N., Schoffelen, J.-M., Hagoort, P., & Drijvers, L. (2024). Attention drives visual processing and audiovisual integration during multimodal communication. The Journal of Neuroscience, 44(10): e0870232023. doi:10.1523/JNEUROSCI.0870-23.2023.
Abstract
During communication in real-life settings, our brain often needs to integrate auditory and visual information, and at the same time actively focus on the relevant sources of information while ignoring interference from irrelevant events. The interaction between integration and attention processes remains poorly understood. Here, we use rapid invisible frequency tagging (RIFT) and magnetoencephalography (MEG) to investigate how attention affects auditory and visual information processing and integration during multimodal communication. We presented human participants (male and female) with videos of an actress uttering action verbs (auditory; tagged at 58 Hz) accompanied by two movie clips of hand gestures on both sides of fixation (attended stimulus tagged at 65 Hz; unattended stimulus tagged at 63 Hz). Integration difficulty was manipulated by a lower-order auditory factor (clear/degraded speech) and a higher-order visual semantic factor (matching/mismatching gesture). We observed an enhanced neural response to the attended visual information during degraded speech compared to clear speech. For the unattended information, the neural response to mismatching gestures was enhanced compared to matching gestures. Furthermore, signal power at the intermodulation frequencies of the frequency tags, indexing non-linear signal interactions, was enhanced in left frontotemporal and frontal regions. Focusing on the left inferior frontal gyrus (LIFG), this enhancement was specific to the attended information, for those trials that benefitted from integration with a matching gesture. Together, our results suggest that attention modulates audiovisual processing and interaction, depending on the congruence and quality of the sensory input.
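As a rough illustration of the frequency-tagging logic (not the authors' analysis code), the sketch below simulates a signal carrying the 58 Hz and 65 Hz tags plus a weak multiplicative interaction, then reads out spectral power at the tags and at the intermodulation frequencies 65 - 58 = 7 Hz and 65 + 58 = 123 Hz. The simulated trace, sampling rate, and noise level are assumptions.

```python
# Illustrative sketch: power at tagging and intermodulation frequencies.
import numpy as np
from scipy.signal import welch

fs, dur = 1000, 10.0                      # sampling rate (Hz), duration (s)
t = np.arange(0, dur, 1 / fs)
f_aud, f_vis = 58.0, 65.0                 # auditory / attended-visual tags

# Toy signal: both tags plus a weak nonlinear (multiplicative) interaction.
# The product of two sines contains the difference and sum frequencies,
# i.e., |65 - 58| = 7 Hz and 65 + 58 = 123 Hz, here buried in noise.
sig = (np.sin(2 * np.pi * f_aud * t)
       + np.sin(2 * np.pi * f_vis * t)
       + 0.2 * np.sin(2 * np.pi * f_aud * t) * np.sin(2 * np.pi * f_vis * t)
       + np.random.default_rng(0).standard_normal(t.size))

freqs, psd = welch(sig, fs=fs, nperseg=4 * fs)   # 0.25 Hz resolution
for f in (f_aud, f_vis, f_vis - f_aud, f_vis + f_aud):
    print(f"{f:6.1f} Hz: power {psd[np.argmin(np.abs(freqs - f))]:.4f}")
```

Power at the intermodulation frequencies can only arise from a non-linear interaction of the two tagged inputs, which is why it serves as an index of audiovisual integration.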
Loke, J., Seijdel, N., Snoek, L., Van der Meer, M., Van de Klundert, R., Quispel, E., Cappaert, N., & Scholte, H. S. (2022). A critical test of deep convolutional neural networks' ability to capture recurrent processing in the brain using visual masking. Journal of Cognitive Neuroscience, 34(12), 2390-2405. doi:10.1162/jocn_a_01914.
Abstract
Recurrent processing is a crucial feature of human visual processing, supporting perceptual grouping, figure-ground segmentation, and recognition under challenging conditions. There is a clear need to incorporate recurrent processing in deep convolutional neural networks (DCNNs), but the computations underlying recurrent processing remain unclear. In this paper, we tested a form of recurrence in deep residual networks (ResNets) to capture recurrent processing signals in the human brain. Though ResNets are feedforward networks, they approximate an excitatory additive form of recurrence: in essence, repeated excitatory activations in response to a static stimulus. Here, we used ResNets of varying depths (reflecting varying levels of recurrent processing) to explain electroencephalography (EEG) activity within a visual masking paradigm. Sixty-two humans and fifty artificial agents (10 ResNet models at each of five depths: 4, 6, 10, 18, and 34) completed an object categorization task. We show that deeper networks (ResNet-10, -18, and -34) explained more variance in brain activity than shallower networks (ResNet-4 and -6). Furthermore, all ResNets captured differences in brain activity between unmasked and masked trials, with differences starting at ∼98 ms after stimulus onset. These early differences indicate that EEG activity reflects 'pure' feedforward signals only briefly (up to ∼98 ms). After ∼98 ms, deeper networks showed a significant increase in explained variance, peaking at ∼200 ms, but only within unmasked trials, not masked trials. In summary, we provide clear evidence that excitatory additive recurrent processing in ResNets captures some of the recurrent processing in humans.
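The "excitatory additive recurrence" reading of ResNets can be made concrete in a short PyTorch sketch: when a residual block's weights are shared across layers, stacking blocks is exactly the iteration x <- x + f(x), so network depth plays the role of recurrent time steps. This is an illustrative toy under that assumption (the paper's ResNets use untied convolutional weights, which only approximate this scheme); the layer sizes and input are assumptions.

```python
# Toy sketch: a weight-shared residual stack as unrolled additive recurrence.
import torch
import torch.nn as nn

class SharedResNet(nn.Module):
    def __init__(self, dim: int, depth: int):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())  # shared block
        self.depth = depth                                      # "time steps"

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each pass re-applies the same excitatory update to the running
        # state: additive recurrence unrolled `depth` times on a static input.
        for _ in range(self.depth):
            x = x + self.f(x)
        return x

x = torch.randn(1, 16)
shallow, deep = SharedResNet(16, 4), SharedResNet(16, 34)
deep.f.load_state_dict(shallow.f.state_dict())   # same weights, more steps
print(shallow(x).shape, deep(x).shape)
```

On this view, comparing ResNet-4 with ResNet-34 amounts to comparing few versus many unrolled recurrent steps applied to the same static stimulus, which is why depth can stand in for the amount of recurrent processing.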