Five experiments on the effects of changes of depth orientation on (a) priming the naming of briefly flashed familiar objects, (b) matching individual sample volumes (geons), and (c) classifying unfamiliar objects (that could readily be decomposed into an arrangement of distinctive geons) all revealed immediate (i.e., not requiring practice) depth invariance. The results can be understood in terms of 3 conditions derived from a model of object recognition (I. Biederman, 1987; J. E. Hummel & I. Biederman, 1992) that have to be satisfied for immediate depth invariance: (a) that the stimuli be capable of activating viewpoint-invariant (e.g., geon) structural descriptions (GSDs), (b) that the GSDs be distinctive (different) for each stimulus, and (c) that the same GSD be activated in original and tested views. The stimuli used in several recent experiments documenting extraordinary viewpoint dependence violated these conditions.
Infants learn less from a televised demonstration than from a live demonstration, a phenomenon known as the video deficit effect. The present study employs a novel approach, using touch-screen technology to examine 15-month-olds' transfer of learning. Infants were randomly assigned either to within-dimension (2D/2D or 3D/3D) or cross-dimension (3D/2D or 2D/3D) conditions. For the within-dimension conditions, an experimenter demonstrated an action by pushing a virtual button on a 2D screen or a real button on a 3D object. Infants were then given the opportunity to imitate using the same screen or object. For the 3D/2D condition, an experimenter demonstrated the action on the 3D object, and infants were given the opportunity to reproduce the action on a 2D touch-screen (and vice versa for the 2D/3D condition). Infants produced significantly fewer target actions in the cross-dimension conditions than in the within-dimension conditions. These findings have important implications for infants' understanding of and learning from 2D images and for their using 2D media as the basis of actions in the real world.
I. Biederman and P. C. Gerhardstein (1993) demonstrated that a representation specifying a distinctive arrangement of viewpoint-invariant parts (a geon structural description [GSD]) dramatically reduced the costs of rotation in depth. M. J. Tarr and H. H. Bülthoff (1995) attempted to make a case for viewpoint-dependent mechanisms, such as mental rotation. Their suggestion that GSDs enjoy no special status in reducing the effects of depth rotation is contradicted by a wealth of direct experimental evidence, as well as by an inadvertent experiment that found no evidence for the spontaneous employment of mental rotation. Their complaint against geon theory's account of entry-level classification rests on a mistaken and unwarranted attribution that geon theory assumes a one-to-one correspondence between GSDs and entry-level names. GSDs provide a representation that distinguishes most entry- and subordinate-level classes and explains why complex objects are described as an arrangement of viewpoint-invariant parts. Consider the nonsense object in Figure 1. When first viewed, how did the reader know that the object was one never encountered previously? Why was the reader fairly confident that he or she would know what the object would look like if rotated 30°? The large central block would still look like a block, and the vertical cylinder and wedge on top of the block would still be on top of the block. The zigzag cross brace connecting the tilting cylinder (ending in a cone) to the wedge would still enjoy the same relation if rotated 30°. These words denoting parts and relations are easily matched to the corresponding regions of the image. Geon theory (Biederman, 1987; Hummel & Biederman, 1992) seeks to account for these readily evident capacities and characteristics of human object recognition by positing that objects are represented as an arrangement of simple viewpoint-invariant parts (geons) and relations, termed a geon structural description (GSD).
The resultant viewpoint-invariant representation is designed to account for many of the entry-level shape-based classifications, such as distinguishing between a chair, an elephant, and a frying pan. The theory also provides an account of the vast majority of