2012
DOI: 10.1162/neco_a_00247

Learning Intermediate-Level Representations of Form and Motion from Natural Movies

Abstract: We present a model of intermediate-level visual representation that is based on learning invariances from movies of the natural environment. The model is composed of two stages of processing: an early feature representation layer and a second layer in which invariances are explicitly represented. Invariances are learned as the result of factoring apart the temporally stable and dynamic components embedded in the early feature representation. The structure contained in these components is made explicit in the ac…

citations
Cited by 79 publications
(78 citation statements)
references
References 64 publications
“…While such ambiguity is fundamental to many visual processes (Koenderink and Van Doorn, 1997) it is acute in the case of silhouettes where local deformation of the silhouette depends upon motion and the curvature of the surface in complex ways. Our calculation of a Motion Index does not decompose changes of the silhouette into different factors of form and motion (Cadieu and Olshausen, 2012; Tomasi and Kanade, 1992). Thus, the brain activity related to our motion index might be due to motion features, form features or the process of factorizing the image motion into form and motion features.…”
Section: Discussion (mentioning)
confidence: 99%
“…One powerful class of models specifies the distribution in a top-down manner in terms of latent variables which explain the structure in the visual stimuli (Olshausen and Field, 1996; Hyvärinen et al, 2009; Karklin and Lewicki, 2009; Zoran and Weiss, 2009; Ranzato and Hinton, 2010; Cadieu and Olshausen, 2012). Another class of models corresponds to a bottom-up approach where the visual stimuli are processed in multiple layers of computation (Osindero et al, 2006; Köster and Hyvärinen, 2010; Gutmann and Hyvärinen, 2012b).…”
Section: Introduction (mentioning)
confidence: 99%
“…These results will report in-depth comparison of a single challenging video sequence (the Foreman sequence 2 ) as well as aggregate statistics from a batch of video from a BBC nature documentary (as used in [52]). The documentary footage is valuable as broad comparison because it contains many different types of motion, including static frames with localized changes and highly dynamic frames with moving subjects across large portions of the visual field.…”
Section: B Cs Recovery Of Natural Video Sequences (mentioning)
confidence: 99%