Publications

  • Loke*, J., Seijdel*, N., Snoek, L., Sorensen, L., Van de Klundert, R., Van der Meer, M., Quispel, E., Cappaert, N., & Scholte, H. S. (2024). Human visual cortex and deep convolutional neural network care deeply about object background. Journal of Cognitive Neuroscience, 36(3), 551-566. doi:10.1162/jocn_a_02098.

    Abstract

    * These authors contributed equally (shared first authorship).
    Deep convolutional neural networks (DCNNs) are able to partially predict brain activity during object categorization tasks, but the factors contributing to this predictive power are not fully understood. To investigate these factors, we compared the activity of four DCNN architectures with EEG recordings obtained from 62 human participants during an object categorization task. Previous physiological studies on object categorization have highlighted the importance of figure-ground segregation, the ability to distinguish objects from their backgrounds. Therefore, we investigated whether figure-ground segregation could explain the predictive power of DCNNs. Using a stimulus set consisting of identical target objects embedded in different backgrounds, we examined the influence of object background versus object category within both EEG and DCNN activity. Crucially, recombining naturalistic objects with experimentally controlled backgrounds creates a challenging yet naturalistic task while retaining experimental control. Our results showed that early EEG activity (< 100 msec) and early DCNN layers represent object background rather than object category. We also found that the ability of DCNNs to predict EEG activity is primarily driven by how both systems process object backgrounds, rather than object categories. By contrasting the activations of trained and untrained (i.e., random weights) DCNNs, we demonstrated the role of figure-ground segregation as a potential prerequisite for the recognition of object features. These findings suggest that both human visual cortex and DCNNs prioritize the segregation of object backgrounds and target objects to perform object categorization. Altogether, our study provides new insights into the mechanisms underlying object categorization, demonstrating that both human visual cortex and DCNNs care deeply about object background.

    Additional information

    link to preprint
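    As an illustration of the trained-versus-untrained contrast used in this study, the following minimal sketch (in Python, using torchvision) builds the same ResNet architecture twice, once with learned and once with random weights, and extracts activations from an early layer via forward hooks. The choice of ResNet-18, the probed layer, and the random input batch are placeholders, not the study's actual models or stimuli.

        import torch
        from torchvision.models import resnet18, ResNet18_Weights

        # Same architecture, trained vs. untrained (random) weights.
        trained = resnet18(weights=ResNet18_Weights.DEFAULT).eval()
        untrained = resnet18(weights=None).eval()

        acts = {}
        def save_to(name):
            def hook(module, inputs, output):
                acts[name] = output.flatten(1)   # (batch, features)
            return hook

        trained.layer1.register_forward_hook(save_to("trained"))
        untrained.layer1.register_forward_hook(save_to("untrained"))

        with torch.no_grad():
            x = torch.randn(8, 3, 224, 224)      # placeholder image batch
            trained(x)
            untrained(x)

        # Comparing the (dis)similarity structure of these activations with EEG
        # patterns is one way to ask whether figure-ground information emerges
        # only after training.
        print(acts["trained"].shape, acts["untrained"].shape)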
  • Seijdel, N., Schoffelen, J.-M., Hagoort, P., & Drijvers, L. (2024). Attention drives visual processing and audiovisual integration during multimodal communication. The Journal of Neuroscience, 44(10): e0870232023. doi:10.1523/JNEUROSCI.0870-23.2023.

    Abstract

    During communication in real-life settings, our brain often needs to integrate auditory and visual information, while at the same time actively focusing on the relevant sources of information and ignoring interference from irrelevant events. The interaction between integration and attention processes remains poorly understood. Here, we use rapid invisible frequency tagging (RIFT) and magnetoencephalography (MEG) to investigate how attention affects auditory and visual information processing and integration during multimodal communication. We presented human participants (male and female) with videos of an actress uttering action verbs (auditory; tagged at 58 Hz) accompanied by two movie clips of hand gestures on both sides of fixation (attended stimulus tagged at 65 Hz; unattended stimulus tagged at 63 Hz). Integration difficulty was manipulated by a lower-order auditory factor (clear/degraded speech) and a higher-order visual semantic factor (matching/mismatching gesture). We observed an enhanced neural response to the attended visual information during degraded speech compared to clear speech. For the unattended information, the neural response to mismatching gestures was enhanced compared to matching gestures. Furthermore, signal power at the intermodulation frequencies of the frequency tags, indexing non-linear signal interactions, was enhanced in left frontotemporal and frontal regions. In the left inferior frontal gyrus (LIFG), this enhancement was specific to the attended information, for those trials that benefitted from integration with a matching gesture. Together, our results suggest that attention modulates audiovisual processing and interaction, depending on the congruence and quality of the sensory input.

    Additional information

    link to preprint
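    The intermodulation logic described above can be simulated in a few lines: when two frequency-tagged signals (58 Hz and 65 Hz, the study's tagging frequencies) interact nonlinearly, spectral power appears at sums and differences of the tags. The sampling rate, noise level, and multiplicative interaction term below are illustrative assumptions, not the recorded MEG data.

        import numpy as np

        fs = 1200.0                          # sampling rate in Hz (assumed)
        t = np.arange(0, 10, 1 / fs)         # 10 s of simulated signal
        f_aud, f_vis = 58.0, 65.0            # tagging frequencies from the study

        aud = np.sin(2 * np.pi * f_aud * t)
        vis = np.sin(2 * np.pi * f_vis * t)

        # A multiplicative (nonlinear) interaction creates intermodulation terms.
        signal = aud + vis + 0.3 * aud * vis + 0.1 * np.random.randn(t.size)

        power = np.abs(np.fft.rfft(signal)) ** 2
        freqs = np.fft.rfftfreq(t.size, 1 / fs)
        for f in (f_aud, f_vis, f_vis - f_aud, f_vis + f_aud):
            print(f"{f:6.1f} Hz : power = {power[np.argmin(np.abs(freqs - f))]:.0f}")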
  • Seijdel, N., Marshall, T. R., & Drijvers, L. (2023). Rapid invisible frequency tagging (RIFT): A promising technique to study neural and cognitive processing using naturalistic paradigms. Cerebral Cortex, 33(5), 1626-1629. doi:10.1093/cercor/bhac160.

    Abstract

    Frequency tagging has been successfully used to investigate selective stimulus processing in electroencephalography (EEG) and magnetoencephalography (MEG) studies. Recently, new projectors have been developed that allow for frequency tagging at higher frequencies (>60 Hz). This technique, rapid invisible frequency tagging (RIFT), provides two crucial advantages over low-frequency tagging: (i) it leaves low-frequency oscillations unperturbed, and thus open for investigation, and (ii) it can render the tagging invisible, resulting in more naturalistic paradigms and a lack of participant awareness. The development of this technique has far-reaching implications, as oscillations involved in cognitive processes can be investigated, and potentially manipulated, in a more naturalistic manner.
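    To make the tagging idea concrete, here is a hypothetical sketch of the per-frame luminance sequence such a projector could display; the 1440 Hz refresh rate and full modulation depth are assumptions for illustration, not parameters taken from the paper.

        import numpy as np

        refresh = 1440          # projector refresh rate in Hz (assumed)
        f_tag = 65.0            # tagging frequency above the flicker fusion limit
        duration = 2.0          # seconds of stimulation

        frames = np.arange(int(refresh * duration))
        # Sinusoidal luminance modulation around mid-grey. Above ~60 Hz the
        # flicker exceeds the perceptual fusion threshold, so the stimulus
        # looks static while still driving a measurable neural response.
        luminance = 0.5 + 0.5 * np.sin(2 * np.pi * f_tag * frames / refresh)
        print(luminance[:8])    # per-frame luminance values in [0, 1]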
  • Loke, J., Seijdel, N., Snoek, L., Van der Meer, M., Van de Klundert, R., Quispel, E., Cappaert, N., & Scholte, H. S. (2022). A critical test of deep convolutional neural networks’ ability to capture recurrent processing in the brain using visual masking. Journal of Cognitive Neuroscience, 34(12), 2390-2405. doi:10.1162/jocn_a_01914.

    Abstract

    Recurrent processing is a crucial feature of human visual processing, supporting perceptual grouping, figure-ground segmentation, and recognition under challenging conditions. There is a clear need to incorporate recurrent processing in deep convolutional neural networks (DCNNs), but the computations underlying recurrent processing remain unclear. In this paper, we tested a form of recurrence in deep residual networks (ResNets) to capture recurrent processing signals in the human brain. Though ResNets are feedforward networks, they approximate an excitatory additive form of recurrence. Essentially, this form of recurrence consists of repeating excitatory activations in response to a static stimulus. Here, we used ResNets of varying depths (reflecting varying levels of recurrent processing) to explain electroencephalography (EEG) activity within a visual masking paradigm. Sixty-two humans and fifty artificial agents (10 ResNet models with depths of 4, 6, 10, 18, and 34) completed an object categorization task. We show that deeper networks (ResNet-10, 18, and 34) explained more variance in brain activity compared to shallower networks (ResNet-4 and 6). Furthermore, all ResNets captured differences in brain activity between unmasked and masked trials, with differences starting at ∼98 ms from stimulus onset. These early differences indicated that EEG activity reflected ‘pure’ feedforward signals only briefly (up to ∼98 ms). After ∼98 ms, deeper networks showed a significant increase in explained variance, peaking at ∼200 ms, but only within unmasked trials, not masked trials. In summary, we provided clear evidence that excitatory additive recurrent processing in ResNets captures some of the recurrent processing in humans.
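    The explained-variance analysis reported above can be sketched generically as a cross-validated regression from network activations to the EEG signal at each time point. The sketch below uses ridge regression on random placeholder data; the study's actual features, mapping procedure, and preprocessing may differ.

        import numpy as np
        from sklearn.linear_model import RidgeCV
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        n_trials, n_features, n_times = 200, 512, 100
        dcnn = rng.standard_normal((n_trials, n_features))  # layer activations
        eeg = rng.standard_normal((n_trials, n_times))      # EEG per trial/time

        # Cross-validated R^2 of the DCNN-to-EEG mapping at every time point.
        r2 = np.array([
            cross_val_score(RidgeCV(alphas=np.logspace(-2, 4, 7)),
                            dcnn, eeg[:, ti], cv=5, scoring="r2").mean()
            for ti in range(n_times)
        ])
        # With real data, this trace would show where in time a given network
        # depth explains brain activity (e.g., diverging after ~98 ms).
        print(r2[:5])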
  • Seijdel, N., Loke, J., Van de Klundert, R., Van der Meer, M., Quispel, E., Van Gaal, S., De Haan, E. H., & Scholte, H. S. (2021). On the necessity of recurrent processing during object recognition: It depends on the need for scene segmentation. Journal of Neuroscience, 41(29), 6281-6289. doi:10.1523/JNEUROSCI.2851-20.2021.

    Abstract

    Although feedforward activity may suffice for recognizing objects in isolation, additional visual operations that aid object recognition might be needed for real-world scenes. One such additional operation is figure-ground segmentation: extracting the relevant features and locations of the target object while ignoring irrelevant features. In this study of 60 human participants (female and male), we show objects on backgrounds of increasing complexity to investigate whether recurrent computations are increasingly important for segmenting objects from more complex backgrounds. Three lines of evidence show that recurrent processing is critical for recognition of objects embedded in complex scenes. First, behavioral results indicated a greater reduction in performance after masking objects presented on more complex backgrounds, with the degree of impairment increasing with background complexity. Second, electroencephalography (EEG) measurements showed clear differences in the evoked response potentials between conditions around time points beyond feedforward activity, and exploratory object decoding analyses based on the EEG signal indicated later decoding onsets for objects embedded in more complex backgrounds. Third, deep convolutional neural network performance confirmed this interpretation: feedforward and less deep networks showed a higher degree of impairment in recognition for objects in complex backgrounds compared with recurrent and deeper networks. Together, these results support the notion that recurrent computations drive figure-ground segmentation of objects in complex scenes.

    SIGNIFICANCE STATEMENT: The incredible speed of object recognition suggests that it relies purely on a fast feedforward buildup of perceptual activity. However, this view is contradicted by studies showing that disruption of recurrent processing leads to decreased object recognition performance. Here, we resolve this issue by showing that whether recurrent processing is crucial for object recognition depends on the context in which an object is presented. For objects presented in isolation or in simple environments, feedforward activity could be sufficient for successful object recognition. However, when the environment is more complex, additional processing seems necessary to select the elements that belong to the object and thereby segregate them from the background.
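    The exploratory decoding analysis mentioned above follows a common time-resolved pattern: train a linear classifier at every time point and define the decoding onset as the first bin where accuracy exceeds chance. The synthetic data and fixed threshold below are placeholders, not the authors' exact pipeline.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(1)
        n_trials, n_channels, n_times = 120, 64, 60
        eeg = rng.standard_normal((n_trials, n_channels, n_times))
        labels = rng.integers(0, 2, n_trials)    # two object categories

        acc = np.array([
            cross_val_score(LogisticRegression(max_iter=1000),
                            eeg[:, :, ti], labels, cv=5).mean()
            for ti in range(n_times)
        ])
        onset = int(np.argmax(acc > 0.55))   # first above-threshold time bin
        print(f"decoding onset at time bin {onset}, accuracy {acc[onset]:.2f}")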
  • Seijdel, N., Scholte, H. S., & De Haan, E. H. F. (2021). Visual features drive the category-specific impairments on categorization tasks in a patient with object agnosia. Neuropsychologia, 161: 108017. doi:10.1016/j.neuropsychologia.2021.108017.

    Abstract

    Object and scene recognition both require mapping of incoming sensory information to existing conceptual knowledge about the world. A notable finding in brain-damaged patients is that they may show differentially impaired performance for specific categories, such as for “living exemplars”. While numerous patients with category-specific impairments have been reported, the explanations for these deficits remain controversial. In the current study, we investigate the ability of a brain-injured patient with a well-established category-specific impairment of semantic memory to perform two categorization experiments: ‘natural’ vs. ‘manmade’ scenes (experiment 1) and objects (experiment 2). Our findings show that the pattern of categorical impairment does not respect the natural versus manmade distinction. This suggests that the impairments may be better explained by differences in visual features, rather than by category membership. Using Deep Convolutional Neural Networks (DCNNs) as ‘artificial animal models’, we further explored this idea. Results indicated that DCNNs with ‘lesions’ in higher-order layers showed similar response patterns, with decreased relative performance for manmade scenes (experiment 1) and natural objects (experiment 2), even though they have no semantic category knowledge apart from a mapping between pictures and labels. Collectively, these results suggest that the direction of category effects depends to a large extent, at least in patient MS's case, on the degree of perceptual differentiation called for, and not on semantic knowledge.

    Additional information

    data and code
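    One way to realize the DCNN 'lesions' described above is to silence a random subset of channels in a higher-order layer with a forward hook and then compare classification performance across categories. The layer choice, lesion fraction, and placeholder inputs below are assumptions for illustration, not the study's exact procedure.

        import torch
        from torchvision.models import resnet18, ResNet18_Weights

        model = resnet18(weights=ResNet18_Weights.DEFAULT).eval()

        def lesion(fraction=0.3, seed=0):
            def hook(module, inputs, output):
                g = torch.Generator().manual_seed(seed)
                keep = (torch.rand(output.shape[1], generator=g) > fraction)
                # Zero out the 'lesioned' channels; returning a tensor from a
                # forward hook replaces the layer's output.
                return output * keep.float().view(1, -1, 1, 1)
            return hook

        handle = model.layer4.register_forward_hook(lesion())  # high-level layer

        with torch.no_grad():
            logits = model(torch.randn(4, 3, 224, 224))        # placeholder images
        print(logits.argmax(dim=1))   # predicted classes under the lesion
        handle.remove()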
  • De Haan, E. H. F., Seijdel, N., Kentridge, R. W., & Heywood, C. A. (2020). Plasticity versus chronicity: Stable performance on category fluency 40 years post‐onset. Journal of Neuropsychology, 14(1), 20-27. doi:10.1111/jnp.12180.

    Abstract

    What is the long‐term trajectory of semantic memory deficits in patients who have suffered structural brain damage? Memory is, by definition, a changing faculty. The traditional view is that after an initial recovery period, the mature human brain has little capacity to repair or reorganize. More recently, it has been suggested that the central nervous system may be more plastic, with the ability to change in neural structure, connectivity, and function. The latter observations are, however, largely based on normal learning in healthy subjects. Here, we report a patient who suffered bilateral ventro‐medial damage after presumed herpes encephalitis in 1971. He was seen regularly in the eighties, and we recently had the opportunity to re‐assess his semantic memory deficits. On semantic category fluency, he showed a very clear category‐specific deficit, performing better than controls on non-living categories and significantly worse on living items. Recent testing showed that his impairments have remained unchanged for more than 40 years. We suggest caution when extrapolating the concept of brain plasticity, as observed during normal learning, to plasticity in the context of structural brain damage.
  • Seijdel, N., Tsakmakidis, N., De Haan, E. H. F., Bohte, S. M., & Scholte, H. S. (2020). Depth in convolutional neural networks solves scene segmentation. PLOS Computational Biology, 16: e1008022. doi:10.1371/journal.pcbi.1008022.

    Abstract

    Feed-forward deep convolutional neural networks (DCNNs) are, under specific conditions, matching and even surpassing human performance in object recognition in natural scenes. This performance suggests that the analysis of a loose collection of image features could support the recognition of natural object categories, without dedicated systems to solve specific visual subtasks. Research in humans, however, suggests that while feedforward activity may suffice for sparse scenes with isolated objects, additional visual operations ('routines') that aid the recognition process (e.g. segmentation or grouping) are needed for more complex scenes. Linking human visual processing to the performance of DCNNs with increasing depth, we here explored if, how, and when object information is differentiated from the backgrounds it appears on. To this end, we controlled the information in both objects and backgrounds, as well as the relationship between them, by adding noise, manipulating background congruence, and systematically occluding parts of the image. Results indicate that with an increase in network depth, there is an increase in the distinction between object and background information. For shallower networks, results indicated a benefit of training on segmented objects. Overall, these results indicate that, de facto, scene segmentation can be performed by a network of sufficient depth. We conclude that the human brain could perform scene segmentation in the context of object identification without an explicit mechanism, by selecting or “binding” features that belong to the object and ignoring other features, in a manner similar to a very deep convolutional neural network.
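    A toy version of the depth comparison, far simpler than the paper's analyses: if deeper features increasingly emphasize the object over its background, the same object on two different backgrounds should yield more similar activations in a deeper network. The random tensors below merely stand in for such image pairs, and the pretrained ImageNet weights are an assumption.

        import torch
        import torch.nn.functional as F
        from torchvision.models import resnet18, resnet34

        pairs = [("resnet18", resnet18(weights="DEFAULT")),
                 ("resnet34", resnet34(weights="DEFAULT"))]

        for name, net in pairs:
            net.eval()
            # Keep everything up to (and including) global average pooling.
            feats = torch.nn.Sequential(*list(net.children())[:-1])
            with torch.no_grad():
                # Placeholders for one object rendered on two backgrounds.
                a = feats(torch.randn(1, 3, 224, 224)).flatten(1)
                b = feats(torch.randn(1, 3, 224, 224)).flatten(1)
            print(name, F.cosine_similarity(a, b).item())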
  • Seijdel, N., Jahfari, S., Groen, I. I. A., & Scholte, H. S. (2020). Low-level image statistics in natural scenes influence perceptual decision-making. Scientific Reports, 10: 10573. doi:10.1038/s41598-020-67661-8.

    Abstract

    A fundamental component of interacting with our environment is the gathering and interpretation of sensory information. When investigating how perceptual information influences decision-making, most researchers have relied on manipulated or unnatural information as perceptual input, resulting in findings that may not generalize to real-world scenes. Unlike simplified, artificial stimuli, real-world scenes contain low-level regularities that are informative about their structural complexity, which the brain could exploit. In this study, participants performed an animal detection task on scenes of low, medium, or high complexity as determined by two biologically plausible natural scene statistics: contrast energy (CE) and spatial coherence (SC). In experiment 1, stimuli were sampled such that CE and SC both influenced scene complexity. Diffusion modelling showed that the speed of information processing was affected by low-level scene complexity. Experiments 2a and 2b refined these observations by showing that isolated manipulation of SC resulted in weaker but comparable effects, with an additional change in response boundary, whereas manipulation of CE alone had no effect. Overall, performance was best for scenes of intermediate complexity. Our systematic definition quantifies how natural scene complexity interacts with decision-making. We speculate that CE and SC serve as an indication to adjust perceptual decision-making based on the complexity of the input.

    Additional information

    supplementary materials
    data
    code and data
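    The two scene statistics named above can be sketched as parameters of a Weibull fit to the distribution of local contrast values, with the scale parameter tracking contrast energy (CE) and the shape parameter tracking spatial coherence (SC). The gradient-magnitude filter and fitting choices below are simplifications, not the authors' exact pipeline.

        import numpy as np
        from scipy import ndimage, stats

        img = np.random.rand(256, 256)     # placeholder grayscale scene

        # Local contrast via a Gaussian gradient-magnitude filter (a stand-in
        # for the biologically inspired contrast filters used in this work).
        contrast = ndimage.gaussian_gradient_magnitude(img, sigma=2.0).ravel()
        contrast = contrast[contrast > 0]

        # Fit a two-parameter Weibull to the local contrast distribution.
        shape, loc, scale = stats.weibull_min.fit(contrast, floc=0)
        print(f"CE (scale) ~ {scale:.4f}, SC (shape) ~ {shape:.3f}")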
  • Seijdel, N., Tsakmakidis, N., De Haan, E. H. F., Bohte, S. M., & Scholte, H. S. (2019). Implicit scene segmentation in deeper convolutional neural networks. In Proceedings of the 2019 Conference on Cognitive Computational Neuroscience (pp. 1059-1062). doi:10.32470/CCN.2019.1149-0.

    Abstract

    Feedforward deep convolutional neural networks (DCNNs) are matching and even surpassing human performance on object recognition. This performance suggests that activation of a loose collection of image features could support the recognition of natural object categories, without dedicated systems to solve specific visual subtasks. Recent findings in humans, however, suggest that while feedforward activity may suffice for sparse scenes with isolated objects, additional visual operations ('routines') that aid the recognition process (e.g. segmentation or grouping) are needed for more complex scenes. Linking human visual processing to the performance of DCNNs with increasing depth, we here explored if, how, and when object information is differentiated from the backgrounds it appears on. To this end, we controlled the information in both objects and backgrounds, as well as the relationship between them, by adding noise, manipulating background congruence, and systematically occluding parts of the image. Results indicated less distinction between object and background features for shallower networks. For those networks, we observed a benefit of training on segmented objects (as compared to unsegmented objects). Overall, deeper networks trained on natural (unsegmented) scenes seem to perform implicit 'segmentation' of the objects from their background, possibly through improved selection of relevant features.
  • Smits, A., Seijdel, N., Scholte, H. S., Heywood, C. A., Kentridge, R. W., & De Haan, E. H. F. (2019). Action blindsight and antipointing in a hemianopic patient. Neuropsychologia, 128, 270-275. doi:10.1016/j.neuropsychologia.2018.03.029.

    Abstract

    Blindsight refers to the observation of residual visual abilities in the hemianopic field of patients without a functional V1. Given the within- and between-subject variability in the preserved abilities and the phenomenal experience of blindsight patients, the fine-grained description of the phenomenon is still debated. Here we tested a patient with established “perceptual” and “attentional” blindsight (cf. Danckert and Rossetti, 2005). Using a pointing paradigm, patient MS, who suffers from a complete left homonymous hemianopia, showed clear above-chance manual localisation of ‘unseen’ targets. In addition, target presentations in his blind field led MS, on occasion, to spontaneous responses towards his sighted field. Structural and functional magnetic resonance imaging was conducted to evaluate the magnitude of V1 damage. Results revealed the presence of a calcarine sulcus in both hemispheres, yet his right V1 is reduced, structurally disconnected, and shows no fMRI response to visual stimuli. Thus, visual stimulation of his blind field can lead to “action blindsight” and spontaneous antipointing in the absence of a functional right V1. With respect to the antipointing, we suggest that MS may have registered the stimulation and subsequently presumed it must have been in his intact half field.

    Additional information

    video
  • Groen, I. I. A., Jahfari, S., Seijdel, N., Ghebreab, S., Lamme, V. A. F., & Scholte, H. S. (2018). Scene complexity modulates degree of feedback activity during object detection in natural scenes. PLoS Computational Biology, 14: e1006690. doi:10.1371/journal.pcbi.1006690.

    Abstract

    Selective brain responses to objects arise within a few hundred milliseconds of neural processing, suggesting that visual object recognition is mediated by rapid feed-forward activations. Yet disruption of neural responses in early visual cortex beyond feed-forward processing stages affects object recognition performance. Here, we unite these discrepant findings by reporting that object recognition involves enhanced feedback activity (recurrent processing within early visual cortex) when target objects are embedded in natural scenes that are characterized by high complexity. Human participants performed an animal target detection task on natural scenes with low, medium or high complexity as determined by a computational model of low-level contrast statistics. Three converging lines of evidence indicate that feedback was selectively enhanced for high complexity scenes. First, functional magnetic resonance imaging (fMRI) activity in early visual cortex (V1) was enhanced for target objects in scenes with high, but not low or medium complexity. Second, event-related potentials (ERPs) evoked by target objects were selectively enhanced at feedback stages of visual processing (from ~220 ms onwards) for high complexity scenes only. Third, behavioral performance for high complexity scenes deteriorated when participants were pressed for time and thus less able to incorporate the feedback activity. Modeling of the reaction time distributions using drift diffusion revealed that object information accumulated more slowly for high complexity scenes, with evidence accumulation being coupled to trial-to-trial variation in the EEG feedback response. Together, these results suggest that while feed-forward activity may suffice to recognize isolated objects, the brain employs recurrent processing more adaptively in naturalistic settings, using minimal feedback for simple scenes and increasing feedback for complex scenes.

    Additional information

    data via OSF
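    The drift diffusion modelling used in this study (and in the Scientific Reports paper above) can be illustrated with the closed-form EZ-diffusion equations (Wagenmakers et al., 2007), which recover drift rate, boundary separation, and non-decision time from accuracy and correct-RT statistics. This is a generic illustration rather than the authors' fitting procedure; slower evidence accumulation for complex scenes would surface as a lower drift rate.

        import numpy as np

        def ez_diffusion(pc, vrt, mrt, s=0.1):
            """EZ-diffusion: pc = proportion correct, vrt = variance of correct
            RTs (s^2), mrt = mean correct RT (s), s = scaling convention."""
            L = np.log(pc / (1 - pc))                    # logit of accuracy
            x = L * (L * pc**2 - L * pc + pc - 0.5) / vrt
            v = np.sign(pc - 0.5) * s * x**0.25          # drift rate
            a = s**2 * L / v                             # boundary separation
            y = -v * a / s**2
            mdt = (a / (2 * v)) * (1 - np.exp(y)) / (1 + np.exp(y))
            return v, a, mrt - mdt                       # v, a, non-decision time

        # Hypothetical cell means for a high-complexity condition:
        v, a, ter = ez_diffusion(pc=0.85, vrt=0.08, mrt=0.60)
        print(f"drift v = {v:.3f}, boundary a = {a:.3f}, Ter = {ter:.3f} s")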
