Under identical viewing conditions, observers made two types of judgement about the shape of stereoscopically defined surfaces: one required an estimate of viewing distance for correct performance (e.g. setting the depth of a hemi-cylinder to equal its half-height or a dihedral angle to 90 deg), the other did not (matching the depth of, for example, sinusoidal corrugations or hemi-cylinders presented at two distances). Depth constancy for the two types of task was about 75% and 100%, respectively. We argue that observers may use a simple "direct" strategy to perform the depth matching task rather than constructing and comparing a metric representation of each surface.
Binocular disparity and motion parallax are powerful cues to the relative depth between objects. However, to recover absolute depth, either additional scaling parameters are required to calibrate the information provided by each cue, or absolute depth can be recovered through the combination of information from both cues (Richards, W. (1985). Structure from stereo and motion. Journal of the Optical Society of America A, 2, 343-349). Nevertheless, not all tasks necessarily require a full specification of the absolute depth structure of a scene, so psychophysical performance may vary depending on the amount of information available and the degree to which absolute depth structure is required. The experiments reported here used three different tasks that varied in the type of geometric information required for them to be completed successfully: a depth nulling task, a depth-matching task, and an absolute depth judgement (shape) task. Real-world stimuli were viewed (i) monocularly with head movements, (ii) binocularly and static, or (iii) binocularly with head movements. No effect of viewing condition was found, whereas there was a large effect of task. Performance was accurate on the matching and nulling tasks and much less accurate on the shape task. The fact that the same perceptual distortions were not evident in all tasks suggests that the visual system can switch strategy according to the demands of the particular task. No evidence was found to suggest that the visual system could exploit the simultaneous presence of disparity and motion parallax.
As we move through the world, our eyes acquire a sequence of images. The information from this sequence is sufficient to determine the structure of a three-dimensional scene, up to a scale factor determined by the distance that the eyes have moved. Previous evidence shows that the human visual system accounts for the distance the observer has walked and the separation of the eyes when judging the scale, shape, and distance of objects. However, in an immersive virtual-reality environment, observers failed to notice when a scene expanded or contracted, despite having consistent information about scale from both distance walked and binocular vision. This failure led to large errors in judging the size of objects. The pattern of errors cannot be explained by assuming a visual reconstruction of the scene with an incorrect estimate of interocular separation or distance walked. Instead, it is consistent with a Bayesian model of cue integration in which the efficacy of motion and disparity cues is greater at near viewing distances. Our results imply that observers are more willing to adjust their estimate of interocular separation or distance walked than to accept that the scene has changed in size.
The present study compared the relative effectiveness of differential perspective and vergence angle manipulations in scaling depth from horizontal disparities. When differential perspective and vergence angle were manipulated together (to simulate a range of viewing distances from 28 cm to infinity), approximately 35% of the scaling required for complete depth constancy was obtained. When they were manipulated separately, the relative influence of each cue depended crucially on the size of the visual display. Differential perspective was effective only when the display was sufficiently large (greater than 20 deg), whereas the influence of vergence angle, although evident at every display size, was greatest in the smaller displays. For each display size the independent effects of the two cues were approximately additive. Perceived size (and the two-dimensional spacing of elements) was also affected by manipulations of differential perspective and vergence. These results confirm that both differential perspective and vergence are effective in scaling the perceived two-dimensional size of elements and the perceived depth from horizontal disparities. They also show that the effect of the two cues in combination is approximately equal to the sum of their individual effects.
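The scaling problem these manipulations address can be made concrete. Under the small-angle approximation, the depth interval signalled by a relative horizontal disparity grows with the square of viewing distance (depth ≈ disparity × D²/I, with I the interocular separation), so the same disparity must be interpreted very differently at 28 cm and at 2 m. A minimal sketch of this geometry, assuming a 6.5 cm interocular separation (the function name and example numbers are ours, not the study's):

```python
import math

IOD_M = 0.065  # assumed interocular separation in metres

def depth_from_disparity(disparity_rad, distance_m, iod_m=IOD_M):
    """Small-angle approximation: depth interval (m) signalled by a
    relative horizontal disparity (rad) at viewing distance (m)."""
    return disparity_rad * distance_m ** 2 / iod_m

# The same 10 arcmin of relative disparity signals very different
# depth intervals at 28 cm and at 2 m, so an unscaled disparity is
# ambiguous about metric depth:
delta = math.radians(10 / 60)            # 10 arcmin in radians
near = depth_from_disparity(delta, 0.28)  # a few millimetres
far = depth_from_disparity(delta, 2.0)    # tens of centimetres
```

Because the required scaling goes as D², an observer who under- or over-estimates viewing distance (from vergence and/or differential perspective) will show exactly the incomplete depth constancy that the abstract quantifies.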
The literature on vertical disparity is complicated by the fact that several different definitions of the term "vertical disparity" are in common use, often without a clear statement of which is intended or a widespread appreciation of the properties of the different definitions. Here, we examine two definitions of retinal vertical disparity: elevation-latitude and elevation-longitude disparities. Near the fixation point these definitions become equivalent, but in general they have quite different dependences on object distance and binocular eye posture, which have not previously been spelt out. We present analytical approximations for each type of vertical disparity, valid for more general conditions than previous derivations in the literature: we do not restrict ourselves to objects near the fixation point or near the plane of regard, and we allow for non-zero torsion, cyclovergence, and vertical misalignments of the eyes. We use these expressions to derive estimates of the latitude and longitude vertical disparities expected at each point in the visual field, averaged over all natural viewing. Finally, we present analytical expressions showing how binocular eye position (gaze direction, convergence, torsion, cyclovergence, and vertical misalignment) can be derived from the vertical disparity field and its derivatives at the fovea.
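The difference between the two definitions can be seen in a toy calculation (our own illustration, far simpler than the paper's analytical approximations: pinhole eyes in primary position, parallel gaze, zero torsion and cyclovergence). Treating each eye as sitting at (±I/2, 0, 0) and looking along +z, elevation-longitude is atan2(y, z) and elevation-latitude is atan2(y, hypot(x, z)) of the visual direction, and each vertical disparity is the left-minus-right difference. In this configuration the two definitions agree (both zero) on the median plane, but for eccentric points the longitude disparity stays zero while the latitude disparity does not:

```python
import numpy as np

def elevations(point, eye_x):
    """Elevation-longitude and elevation-latitude (rad) of a point seen
    from a pinhole eye at (eye_x, 0, 0) looking along +z, zero torsion."""
    x, y, z = point[0] - eye_x, point[1], point[2]
    lon = np.arctan2(y, z)               # elevation-longitude
    lat = np.arctan2(y, np.hypot(x, z))  # elevation-latitude
    return lon, lat

def vertical_disparities(point, iod=0.065):
    """Left-minus-right vertical disparity under each definition."""
    lon_l, lat_l = elevations(point, -iod / 2)
    lon_r, lat_r = elevations(point, +iod / 2)
    return lon_l - lon_r, lat_l - lat_r

# Median-plane point: both disparities vanish by symmetry.
midline = vertical_disparities(np.array([0.0, 0.1, 0.5]))
# Eccentric point: latitude disparity is non-zero, longitude stays zero.
eccentric = vertical_disparities(np.array([0.3, 0.1, 0.5]))
```

Even this stripped-down case shows why the choice of definition matters: the two disparity fields differ away from the fixation point, which is the regime the paper's more general expressions are designed to handle.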
Cue combination rules have often been applied to the perception of surface shape but not to judgements of object location. Here, we used immersive virtual reality to explore the relationship between different cues to distance. Participants viewed a virtual scene and judged the change in distance of an object presented in two intervals, where the scene changed in size between intervals (by a factor of between 0.25 and 4). We measured thresholds for detecting a change in object distance when there were only 'physical' (stereo and motion parallax) or 'texture-based' cues (independent of the scale of the scene) and used these to predict biases in a distance matching task. Under a range of conditions, in which the viewing distance and position of the target relative to other objects was varied, the ratio of 'physical' to 'texture-based' thresholds was a good predictor of biases in the distance matching task. The cue combination approach, which successfully accounts for our data, relies on quite different principles from those underlying traditional models of 3D reconstruction.
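The prediction at the heart of this approach can be sketched in a few lines. Under standard maximum-likelihood cue combination, each cue is weighted by its inverse variance; since discrimination thresholds are taken to be proportional to the underlying noise standard deviations, the weight of the 'physical' cue follows directly from the measured threshold ratio. A minimal sketch under those assumptions (function and variable names are ours):

```python
def physical_weight(thr_physical, thr_texture):
    """Inverse-variance weight for the 'physical' (stereo and motion
    parallax) cue, taking each threshold as proportional to noise SD."""
    return thr_texture ** 2 / (thr_physical ** 2 + thr_texture ** 2)

def combined_estimate(est_physical, est_texture, thr_physical, thr_texture):
    """Reliability-weighted average of the two cues' estimates of the
    change in object distance between intervals."""
    w = physical_weight(thr_physical, thr_texture)
    return w * est_physical + (1.0 - w) * est_texture

# When the scene is rescaled between intervals, the texture-based cue
# signals no change in object distance while the physical cues signal
# the true change; the predicted (biased) percept lies in between.
```

With equal thresholds the predicted percept splits the difference between the two cues; as the physical cues become relatively noisier (for example, at far viewing distances), the texture-based cue dominates and the predicted bias grows, which is the sense in which the threshold ratio predicts matching biases.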
Using an immersive virtual reality system, we measured the ability of observers to detect the rotation of an object when its movement was yoked to the observer's own translation. Most subjects had a large bias such that a static object appeared to rotate away from them as they moved. Thresholds for detecting target rotation were similar to those for an equivalent speed discrimination task carried out by static observers, suggesting that visual discrimination is the predominant limiting factor in detecting target rotation. Adding a stable visual reference frame almost eliminated the bias. Varying the viewing distance of the target had little effect, consistent with observers underestimating distance walked. However, accuracy of walking to a briefly presented visual target was high and not consistent with an underestimation of distance walked. We discuss implications for theories of a task-independent representation of visual space.