Abstract: Visual processing is thought to function in a coarse-to-fine manner. Low spatial frequencies (LSF), conveying coarse information, would be processed early to generate predictions. These LSF-based predictions would then facilitate the integration of high spatial frequencies (HSF), which convey fine details. The predictive role of LSF might be crucial in automatic face processing, where high performance could be explained by an accurate selection of cues during early processing. In the present study, we used a visu…
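For readers unfamiliar with the manipulation, low- and high-spatial-frequency versions of a stimulus are commonly produced with a Gaussian low-pass filter and its residual. The sketch below illustrates the general idea in Python; the cutoff value and the random placeholder image are illustrative assumptions, not parameters taken from the study.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def split_spatial_frequencies(image, sigma=8.0):
    """Split a grayscale image into low- and high-spatial-frequency content.

    A Gaussian low-pass filter keeps the coarse (LSF) structure; subtracting
    it from the original leaves the fine (HSF) detail. `sigma` (in pixels) is
    an illustrative cutoff, not a value from the study described above.
    """
    image = image.astype(np.float64)
    lsf = gaussian_filter(image, sigma=sigma)   # coarse structure
    hsf = image - lsf                           # fine detail (residual)
    return lsf, hsf

# Example with a random placeholder "image"; real use would load a face photograph.
rng = np.random.default_rng(0)
lsf, hsf = split_spatial_frequencies(rng.random((256, 256)))
```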
“…This suggests that when face and car images are sufficiently distorted the model ceases to output waveforms comparable with those in the training data. It is not unreasonable to suspect that changes to the appearance of images might similarly influence ERP waveforms [32][33][34].…”
Objective: Event-related potential (ERP) sensitivity to faces is predominantly characterized by an N170 peak that has greater amplitude and shorter latency when elicited by human faces than by images of other objects. We aimed to develop a computational model of visual ERP generation, consisting of a three-dimensional convolutional neural network (CNN) connected to a recurrent neural network (RNN), to study this phenomenon.
Approach: The CNN provided image representation learning, complementing the sequence learning of the RNN for modeling visually evoked potentials. We used open-access data from ERP CORE (40 subjects) to develop the model, generated synthetic images with a generative adversarial network for simulating experiments, and then collected additional data (16 subjects) to validate predictions of these simulations. For modeling, visual stimuli presented during ERP experiments were represented as sequences of images (time × pixels), which were provided as inputs to the model. By filtering and pooling over the spatial dimensions, the CNN transformed these inputs into sequences of vectors that were passed to the RNN. The ERP waveforms evoked by visual stimuli were provided to the RNN as labels for supervised learning, and the whole model was trained end-to-end on the open-access dataset to reproduce ERP waveforms evoked by visual events.
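A minimal sketch of this kind of architecture, assuming PyTorch and illustrative layer sizes (the published model's exact hyperparameters, and the choice of a GRU, are assumptions rather than details from the paper), is shown below: the 3D convolutions filter and pool only over the spatial dimensions so that the temporal dimension is preserved for the recurrent stage, and a linear readout maps each hidden state to one ERP sample.

```python
import torch
import torch.nn as nn

class CnnRnnErpModel(nn.Module):
    """Sketch of a CNN-to-RNN visual ERP model (illustrative sizes only).

    Input: stimulus video as (batch, 1, time, height, width).
    Output: one ERP sample per time step, (batch, time).
    """

    def __init__(self, hidden_size=64):
        super().__init__()
        # 3D convolutions with a temporal kernel length of 1, so filtering and
        # pooling act over space only and the time axis is left intact.
        self.cnn = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(1, 5, 5), padding=(0, 2, 2)),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 4, 4)),
            nn.Conv3d(8, 16, kernel_size=(1, 5, 5), padding=(0, 2, 2)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),  # pool away all spatial extent
        )
        self.rnn = nn.GRU(input_size=16, hidden_size=hidden_size, batch_first=True)
        self.readout = nn.Linear(hidden_size, 1)  # one EEG sample per time step

    def forward(self, video):
        feats = self.cnn(video)                # (batch, 16, time, 1, 1)
        feats = feats.squeeze(-1).squeeze(-1)  # (batch, 16, time)
        feats = feats.permute(0, 2, 1)         # (batch, time, 16)
        out, _ = self.rnn(feats)               # (batch, time, hidden)
        return self.readout(out).squeeze(-1)   # (batch, time)

# End-to-end training would minimise, e.g., the MSE between this output and the
# recorded ERP waveform evoked by the same stimulus sequence.
```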
Main results: Cross-validated model outputs correlated strongly with the open-access data (r = 0.98) and the validation study data (r = 0.78). The open-access and validation study data correlated with each other to a similar degree (r = 0.81). Some aspects of model behavior were consistent with neural recordings while others were not, suggesting a promising albeit limited capacity for modeling the neurophysiology of face-sensitive ERP generation.
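The reported r values are presumably Pearson correlations computed across whole waveforms; a minimal illustration of that comparison, with placeholder arrays standing in for a model output and a grand-average ERP sampled on the same time base, is:

```python
import numpy as np
from scipy.stats import pearsonr

# Placeholder waveforms; real use would substitute the cross-validated model
# output and the grand-average ERP for the same stimulus condition.
predicted_erp = np.sin(np.linspace(0, 3 * np.pi, 256))
observed_erp = predicted_erp + 0.1 * np.random.default_rng(0).normal(size=256)

r, _ = pearsonr(predicted_erp, observed_erp)
print(f"Pearson r between predicted and observed waveform: {r:.2f}")
```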
Significance: The approach developed in this work is potentially of significant value for visual neuroscience research, where it may be adapted for multiple contexts to study computational relationships between visual stimuli and evoked neural activity.
“…There are alternative, albeit perhaps less exotic or conceptually appealing, explanations that have been proposed to account for differential responses to passive auditory oddball stimulation (Butler, 1968; May, 2021; May & Tiitinen, 2010; O'Reilly, 2021a, 2021b; O'Reilly & Conway, 2021; O'Reilly & O'Reilly, 2021). However, these are rarely considered preferable to the well‐supported predictive coding framework, even when dutifully considered (Lacroix et al., 2022). As they currently stand, neither predictive coding nor alternative sensory processing theories succinctly encapsulate all of the physiological phenomena observed in response to sequences of sounds, which shall therefore remain the purview of future efforts directed towards improving our understanding of these processes.…”
The ability to respond appropriately to sensory information received from the external environment is among the most fundamental capabilities of central nervous systems. In the auditory domain, processes underlying this behaviour are studied by measuring auditory‐evoked electrophysiology during sequences of sounds with predetermined regularities. Identifying neural correlates of ensuing auditory novelty responses is supported by research in experimental animals. In the present study, we reanalysed epidural field potential recordings from the auditory cortex of anaesthetised mice during frequency and intensity oddball stimulation. Multivariate pattern analysis (MVPA) and hierarchical recurrent neural network (RNN) modelling were adopted to explore these data with greater resolution than previously considered using conventional methods. Time‐wise and generalised temporal decoding MVPA approaches revealed previously underestimated asymmetry between responses to sound‐level transitions in the intensity oddball paradigm, in contrast with tone frequency changes. After training, the cross‐validated RNN model architecture with four hidden layers produced output waveforms in response to simulated auditory inputs that were strongly correlated with grand‐average auditory‐evoked potential waveforms (r² > 0.9). Units in hidden layers were classified based on their temporal response properties and characterised using principal component analysis and sample entropy. These analyses demonstrated spontaneous alpha rhythms, sound onset and offset responses, and putative 'safety' and 'danger' units activated by relatively inconspicuous and salient changes in auditory inputs, respectively. The hypothesised existence of corresponding biological neural sources is naturally derived from this model. If proven, this could have significant implications for prevailing theories of auditory processing.
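As an illustration of the generalised temporal decoding referred to above, the sketch below builds a train-time by test-time accuracy matrix with scikit-learn. The classifier choice and the absence of cross-validation are simplifying assumptions for brevity, not the study's actual analysis pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def temporal_generalization(X, y):
    """Minimal temporal-generalisation decoding matrix.

    X: trials x channels x time array of evoked responses; y: trial labels
    (e.g. standard vs deviant). A classifier is trained at each time point and
    tested at every time point, yielding a train-time x test-time accuracy map.
    A real analysis would cross-validate rather than score on the training trials.
    """
    n_trials, n_channels, n_times = X.shape
    scores = np.zeros((n_times, n_times))
    for t_train in range(n_times):
        clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
        clf.fit(X[:, :, t_train], y)
        for t_test in range(n_times):
            scores[t_train, t_test] = clf.score(X[:, :, t_test], y)
    return scores

# Example with random placeholder data: 60 trials, 32 channels, 50 time points.
rng = np.random.default_rng(0)
scores = temporal_generalization(rng.normal(size=(60, 32, 50)),
                                 rng.integers(0, 2, size=60))
```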