A mobile observer samples sequences of narrow-field projections of configurations in ambient space. The so-called structure-from-motion problem is to infer the structure of these spatial configurations from the sequence of projections. For rigid transformations, a unique metrical reconstruction is known to be possible from three orthographic views of four points. However, human observers seem able to obtain much shape information from a mere pair of views, as is evident in the case of binocular stereo. Moreover, human observers seem to find little use for the information provided by additional views, even though some improvement certainly occurs. The rigidity requirement in its strict form is also relaxed. We indicate how solutions of the structure-from-motion problem can be stratified in such a way that one explicitly knows at which stages various a priori assumptions enter and specific geometrical expertise is required. An affine stage is identified at which only smooth deformation is assumed (thus no rigidity constraint is involved) and no metrical concepts are required. This stage allows one to find the spatial configuration (modulo an affinity) from two views. The addition of metrical methods allows one to find shape from two views, modulo a relief transformation (depth scaling and shear). The addition of a third view then merely serves to settle the calibration. Results of a numerical experiment are discussed.
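The affine stage described above can be illustrated with a minimal sketch. It assumes parallel projection in both views and a deformation between views that is a single global affine map, so each view acts as a 2x3 affine camera. Three reference points fix a fiducial plane, a fourth gauges depth, and any other point receives affine coordinates from image measurements alone. The function names, the reference points, and the camera matrices are hypothetical and chosen only for illustration; this is not presented as the authors' implementation.

```python
# Hypothetical sketch: affine coordinates of a point from two parallel-projection
# views. All names and numbers are illustrative assumptions, not data from the paper.

def project(M, t, P):
    """Apply a 2x3 affine camera M with 2-D offset t to a 3-D point P."""
    return tuple(sum(M[i][k] * P[k] for k in range(3)) + t[i] for i in range(2))

def sub(u, v):
    """Difference of two 2-D image vectors."""
    return (u[0] - v[0], u[1] - v[1])

def solve2(e1, e2, d):
    """Solve alpha*e1 + beta*e2 = d for 2-D vectors (Cramer's rule)."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    return ((d[0] * e2[1] - d[1] * e2[0]) / det,
            (e1[0] * d[1] - e1[1] * d[0]) / det)

def plane_coords_and_disparity(ref1, ref2, x1, x2):
    """Coordinates of x1 in the image of the fiducial plane O, A, B (view 1),
    plus the offset of x2 from where that plane point appears in view 2."""
    o1, a1, b1 = ref1[:3]
    o2, a2, b2 = ref2[:3]
    alpha, beta = solve2(sub(a1, o1), sub(b1, o1), sub(x1, o1))
    predicted = tuple(o2[i] + alpha * (a2[i] - o2[i]) + beta * (b2[i] - o2[i])
                      for i in range(2))
    return alpha, beta, sub(x2, predicted)

def affine_coords(ref1, ref2, x1, x2):
    """Affine coordinates (a, b, c) of a point in the frame O, A, B, C,
    computed from image measurements only (no metric, no rigidity)."""
    al_c, be_c, d_c = plane_coords_and_disparity(ref1, ref2, ref1[3], ref2[3])
    al, be, d = plane_coords_and_disparity(ref1, ref2, x1, x2)
    # All disparities are parallel; their ratio to that of C is the affine depth.
    gamma = (d[0] * d_c[0] + d[1] * d_c[1]) / (d_c[0] ** 2 + d_c[1] ** 2)
    return (al - gamma * al_c, be - gamma * be_c, gamma)

# Four non-coplanar reference points and one test point with known coordinates.
O, A, B, C = (0.2, -0.1, 1.0), (1.3, 0.2, 0.7), (-0.4, 1.1, 1.5), (0.5, 0.6, 2.4)
true_abc = (0.3, -1.2, 2.0)
P = tuple(O[k] + true_abc[0] * (A[k] - O[k]) + true_abc[1] * (B[k] - O[k])
          + true_abc[2] * (C[k] - O[k]) for k in range(3))

# View 1 is orthographic along z; view 2 is an arbitrary (non-rigid) affine camera.
M1, t1 = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)), (0.0, 0.0)
M2, t2 = ((0.9, 0.1, 0.4), (-0.2, 1.1, 0.3)), (0.5, -0.3)

ref1 = [project(M1, t1, Q) for Q in (O, A, B, C)]
ref2 = [project(M2, t2, Q) for Q in (O, A, B, C)]
recovered = affine_coords(ref1, ref2, project(M1, t1, P), project(M2, t2, P))
print(recovered)  # should agree with true_abc up to rounding
```

The recovered triple describes the configuration only modulo an affinity: the 3-D positions of the reference points never enter the computation, so any affine image of the same scene would yield the same coordinates.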
Subjects adjusted a local gauge figure so that it perceptually "fit" the apparent surfaces of objects depicted in photographs. We obtained a few hundred data points per session, covering the picture according to a uniform lattice. Settings were repeated 3 times for each of 3 subjects. Almost all of the variability resided in the slant; the relative spread in the slant was about 25% (Weber fraction). The tilt was reproduced with a typical spread of about 10°. The rank correlation of the slant settings of different observers was high; thus the slant settings of different subjects were monotonically related. The variability could be predicted from the scatter in repeated settings by the individual observers. Although repeated settings by a single observer agreed within 5%, observers did not agree on the value of the slant, even on the average. Scaling factors of a doubling in the depth dimension were encountered between different subjects. The data conformed quite well to some hypothetical fiducial global surface, the orientation of which was "probed" by the subject's local settings. The variability was completely accounted for by single-observer scatter. These conclusions are based upon an analysis of the internal structure of the local settings. We did not address the problem of veridicality, that is, conformity to some "real object."
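A small sketch may help show why a pure depth scaling between observers would leave the tilt untouched and relate their slants monotonically. It assumes the conventional definitions, which the abstract does not spell out: for a surface with depth gradient (p, q), slant is the arctangent of the gradient magnitude and tilt is the gradient direction in the picture plane. The function names and the sample values are hypothetical.

```python
import math

def slant_tilt(p, q):
    """Slant and tilt (radians) of a surface patch with depth gradient (p, q).
    Assumed convention: slant = atan(|gradient|), tilt = direction of gradient."""
    return math.atan(math.hypot(p, q)), math.atan2(q, p)

def rescale_depth(slant, k):
    """Slant after depth is multiplied by k: tan(slant) scales by k, tilt is unchanged."""
    return math.atan(k * math.tan(slant))

# Illustrative slants (degrees) as set by one hypothetical observer ...
slants_deg = [10.0, 30.0, 45.0, 60.0]
# ... and the slants of a second hypothetical observer whose depth is doubled.
doubled_deg = [math.degrees(rescale_depth(math.radians(s), 2.0)) for s in slants_deg]

for s, d in zip(slants_deg, doubled_deg):
    print(f"{s:5.1f} deg -> {d:5.1f} deg")

# Doubling the gradient changes the slant but not the tilt.
slant_a, tilt_a = slant_tilt(0.5, 0.5)
slant_b, tilt_b = slant_tilt(1.0, 1.0)
```

Because the mapping from one slant to the other is strictly increasing, the two sets of settings would have a perfect rank correlation even though their values disagree, which is consistent with the pattern reported above.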
It is argued that the internal model of any object must take the form of a function, such that for any intended action the resulting reafference is predictable. This function can be derived explicitly for the case of visual perception of rigid bodies by ambulant observers. The function depends on physical causation, not physiology; consequently, one can make a priori statements about possible internal models. A posteriori, it seems likely that the orientation-sensitive units described by Hubel and Wiesel constitute a physiological substrate subserving the extraction of the invariants of this function. The function is used to define a measure for the visual complexity of solid shape. Relations with Gestalt theories of perception are discussed.