We present a technique for the computation of 2D component velocity from image sequences. Initially, the image sequence is represented by a family of spatiotemporal velocity-tuned linear filters. Component velocity, computed from spatiotemporal responses of identically tuned filters, is expressed in terms of the local first-order behavior of surfaces of constant phase. Justification for this definition is discussed from the perspectives of both 2D image translation and deviations from translation that are typical in perspective projections of 3D scenes. The resulting technique is predominantly linear, efficient, and suitable for parallel processing. Moreover, it is local in space-time, robust with respect to noise, and permits multiple estimates within a single neighborhood. Promising quantitative results are reported from experiments with realistic image sequences, including cases with sizeable perspective deformation.
I Introduction
This article addresses the quantitative measurement of velocity in image sequences. The important issues are (1) the accuracy with which velocity can be computed; (2) robustness with respect to smooth contrast variations and affine deformation (i.e., deviations from 2D image translation that are typical in perspective projections of 3D scenes); (3) localization in space-time; (4) noise robustness; and (5) the ability to discern different velocities within a single neighborhood. Our approach is based on the phase information in a local-frequency representation of the image sequence that is produced by a family of velocity-tuned linear filters. The velocity measurements are limited to component velocity: the projected components of 2D velocity onto directions normal to oriented structure in the image (a definition is given in section 3). The combination of these measurements to derive the full 2D velocity is briefly discussed.
Our reasons for concentrating on component velocity (also referred to as normal velocity) stem from a desire for local measurements and from the well-known aperture problem (Marr and Ullman 1981). Local measurements allow smoothly varying velocity fields to be estimated based on translational image velocity, as opposed to more complicated descriptions of the velocity field over larger image patches. However, in narrow spatiotemporal apertures the intensity structure is often roughly one-dimensional, so that only one component of the image velocity can be accurately determined. To obtain full 2D velocity fields, larger space-time support is therefore required. In our view, the common assumptions of smoothness, uniqueness, and the coherence of neighboring measurements that are involved in combining local measurements to determine 2D velocity, to fill in regions without measurements, and to reduce the effects of noise, should be viewed as aspects of interpretation, and as such, are distinct issues.
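The aperture problem and the notion of component velocity can be illustrated with a small numerical sketch. It is not the filter-based method of this article: a complex plane wave stands in, by assumption, for the response of a single velocity-tuned filter to one-dimensional structure, and the function names (`component_velocity`, `plane_wave`, `phase_component_velocity`) are hypothetical. The phase-based estimate uses only the constraint that phase is constant along the motion, $\nabla\phi \cdot \mathbf{v} + \phi_t = 0$, which gives the normal component $-\phi_t \nabla\phi / \|\nabla\phi\|^2$; the article's own definition is deferred to section 3.

```python
import numpy as np

def component_velocity(v, k):
    """Project 2D velocity v onto the unit normal of structure with wave vector k."""
    n = np.asarray(k, dtype=float) / np.linalg.norm(k)
    return np.dot(v, n) * n

def plane_wave(v, k, xs, ys, t):
    """Complex one-dimensional pattern with wave vector k translating with velocity v.

    Assumption: this idealized signal stands in for a filter response.
    """
    X, Y = np.meshgrid(xs, ys, indexing="xy")  # axis 0 is y, axis 1 is x
    return np.exp(1j * (k[0] * (X - v[0] * t) + k[1] * (Y - v[1] * t)))

def phase_component_velocity(r0, r1, dx=1.0, dt=1.0):
    """Estimate the normal velocity from phase differences of two complex frames.

    Valid only while each phase difference between neighbouring samples
    stays inside (-pi, pi).
    """
    phi_x = np.angle(r0[:, 1:] * np.conj(r0[:, :-1]))[:-1, :] / dx
    phi_y = np.angle(r0[1:, :] * np.conj(r0[:-1, :]))[:, :-1] / dx
    phi_t = np.angle(r1 * np.conj(r0))[:-1, :-1] / dt
    grad_sq = phi_x**2 + phi_y**2
    return -phi_t * phi_x / grad_sq, -phi_t * phi_y / grad_sq

xs, ys = np.arange(16.0), np.arange(12.0)
k = np.array([0.3, 0.4])                   # normal direction is (0.6, 0.8)
v_a = np.array([1.0, 0.5])
v_b = v_a + 2.0 * np.array([-0.8, 0.6])    # add motion along the structure

frames_a = [plane_wave(v_a, k, xs, ys, t) for t in (0.0, 1.0)]
frames_b = [plane_wave(v_b, k, xs, ys, t) for t in (0.0, 1.0)]

# Both 2D velocities produce the same image sequence (aperture problem).
same_sequence = all(np.allclose(a, b) for a, b in zip(frames_a, frames_b))

# Only the normal component can be recovered from the phase.
vx, vy = phase_component_velocity(*frames_a)
expected = component_velocity(v_a, k)
print(same_sequence, expected, vx.mean(), vy.mean())
```

In this sketch the two velocities differ only by motion parallel to the oriented structure, so the sequences coincide and the phase-based estimate returns the shared normal component at every pixel. Real filter responses are band-limited rather than pure plane waves, so the phase derivatives would vary over the image and would need the stability considerations discussed later in the article.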
In considering just normal components of velocity we hope to obtain more accurate estimates of motion within smaller apertures, which leads to better spatial resolution of veloci...