2011
DOI: 10.1038/nn.2889

Metamers of the ventral stream

Abstract: The human capacity to recognize complex visual patterns emerges in a sequence of brain areas known as the ventral stream, beginning with primary visual cortex (V1). We develop a population model for mid-ventral processing, in which non-linear combinations of V1 responses are averaged within receptive fields that grow with eccentricity. To test the model, we generate novel forms of visual metamers: stimuli that differ physically, but look the same. We develop a behavioral protocol that uses metameric stimuli t…
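The pooling idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: it is one-dimensional, it averages raw values rather than non-linear combinations of V1 responses, and its windows do not overlap. The function names, the `scaling` value of 0.5, and the sample data are all assumptions chosen for illustration. What it shows is that when window width grows in proportion to eccentricity, rearranging values inside each window yields a physically different signal with identical pooled averages, which is a metamer of this toy model.

```python
def pooling_windows(max_ecc, scaling=0.5, start_ecc=1.0):
    """Contiguous (lo, hi) windows whose width is scaling * eccentricity.

    `scaling` and `start_ecc` are illustrative placeholders.
    """
    windows = []
    lo = start_ecc
    while lo < max_ecc:
        hi = min(lo + scaling * lo, max_ecc)  # width grows with eccentricity
        windows.append((lo, hi))
        lo = hi
    return windows


def pooled_means(samples, windows):
    """Average the values of (eccentricity, value) samples in each window."""
    means = []
    for lo, hi in windows:
        vals = [v for e, v in samples if lo <= e < hi]
        means.append(sum(vals) / len(vals) if vals else None)
    return means


def scramble_within_windows(samples, windows):
    """Reverse the order of values inside each window, keeping positions fixed."""
    out = list(samples)
    for lo, hi in windows:
        idx = [i for i, (e, _) in enumerate(samples) if lo <= e < hi]
        vals = [samples[i][1] for i in idx][::-1]
        for i, v in zip(idx, vals):
            out[i] = (samples[i][0], v)
    return out


windows = pooling_windows(max_ecc=8.0)
# Toy signal: integer values sampled every 0.25 units of eccentricity.
original = [(1.0 + 0.25 * i, (7 * i) % 10) for i in range(28)]
scrambled = scramble_within_windows(original, windows)

print(windows)
print(pooled_means(original, windows) == pooled_means(scrambled, windows))
```

Because only the per-window averages are retained, anything that preserves them is invisible to this toy representation. The wider windows at larger eccentricities tolerate larger rearrangements.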

citations
Cited by 589 publications
(871 citation statements)
references
References 49 publications
“…To control for low-level confounds, we tested the (trivial) pixel model, as well as SIFT, a simple baseline computer vision model (30). We also evaluated a V1-like Gabor-based model (25), a V2-like conjunction-of-Gabors model (31), and HMAX (17,28), a model targeted at explaining higher ventral cortex and that has receptive field sizes similar to those observed in IT. The HMAX model can be trained in a domain-specific fashion, and to give it the best chance of success, we performed this training using the benchmark images themselves (see SI Text for more information on the comparison models).…”
Section: Results
mentioning
confidence: 99%
“…Considering that curvature can also be regarded as a higher-order combinatory feature of Gabor filter output, it may be that there is shared tuning between these contour fragments and textures. However, texture synthesis does not capture global image structures and cannot reproduce inhomogeneous image features such as object contours (20), although it may explain visual discriminability of those features in peripheral visual field (21,41,42). It is therefore also plausible that tuning to geometric shapes or curvatures is hidden in the unexplained variances in the neural responses recorded in this study, or that they can be attributed to a different population of V4 neurons tuned to those features.…”
Section: Discussion
mentioning
confidence: 97%
“…Their algorithm is particularly inspiring because PS statistics use filters and computations that share biological properties. It was recently shown, for example, that a version of their synthesis algorithm can generate perceptually indistinguishable visual images (visual metamers) (21) and that naturalistic textures incorporating these summary statistics strongly activate neurons in V2, compared with noise images lacking these features (7).…”
mentioning
confidence: 99%
“…Crowding occurs when the perception of a target stimulus, which would be visible if presented in isolation, is impaired by other surrounding items in close spatial proximity (Bouma, 1970; Pelli & Tillman, 2008). Although there are many models of crowding (for a review, see Whitney & Levi, 2011), a popular account characterises it as a pooling process that regularises the noisy representation of position in the periphery by averaging the target and flanker identities (Freeman & Simoncelli, 2011; Greenwood, Bex, & Dakin, 2009; Parkes, Lund, Angelucci, Solomon, & Morgan, 2001). Functionally, then, crowding may reflect the blurring and loss of information across space in the way that OSM reflects the blurring and loss of information across time.…”
mentioning
confidence: 99%