Abstract-This paper presents an object tracking method for object-based video processing that uses a two-dimensional (2-D) Gabor wavelet transform (GWT) and a 2-D golden section algorithm. An object in the current frame is modeled by local features from a number of selected feature points and by the global placement of these feature points. The feature points are stochastically selected based on the energy of their GWT coefficients. Points with higher energy have a higher probability of being selected, since they are visually more important. The amplitudes of the GWT coefficients of a feature point are then used as its local feature. This takes advantage of the fact that Gabor wavelets are highly localized in both the spatial and the frequency domains. The global placement of the feature points is described by a 2-D mesh whose feature is the set of areas of the triangles formed by the feature points. In this way, a local feature is represented by a GWT coefficient amplitude vector, and a global feature is represented by a triangle area vector. One advantage of the 2-D mesh is that the direction of its triangle area vector is invariant to affine transforms. Consequently, the similarity between two local features or two global features can be defined as a function of the angle and the length ratio between the two vectors, and the overall similarity between two objects is a weighted sum of the local and global similarities. To find the corresponding object in the next frame, the 2-D golden section algorithm is employed, which can be shown to be the fastest algorithm for finding the maximum of a unimodal function. Our results show that the method is robust to object deformation and supports object tracking in noisy video sequences.

Index Terms-Content-based video, feature points, Gabor wavelets, golden section, mesh, object-based video, object tracking.
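
The affine-invariance property of the triangle area vector can be checked numerically. The sketch below builds a small hypothetical mesh, applies an arbitrary affine map, and compares the two area vectors by angle and length ratio. The mesh, the function names, and the specific similarity form (cosine of the angle multiplied by the min/max length ratio) are assumptions made for illustration; the abstract states only that similarity is a function of the angle and the length ratio.

```python
import numpy as np

def triangle_areas(points, triangles):
    """Return the vector of triangle areas for a 2-D mesh."""
    p = np.asarray(points, dtype=float)
    areas = []
    for i, j, k in triangles:
        a, b = p[j] - p[i], p[k] - p[i]
        # Half the absolute 2-D cross product gives the triangle area.
        areas.append(0.5 * abs(a[0] * b[1] - a[1] * b[0]))
    return np.array(areas)

def similarity(u, v):
    """Assumed form: cosine of the angle times the min/max length ratio."""
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    cos_angle = float(np.dot(u, v) / (nu * nv))
    length_ratio = float(min(nu, nv) / max(nu, nv))
    return cos_angle, length_ratio, cos_angle * length_ratio

# Hypothetical feature points and the triangles that connect them.
points = [(0, 0), (4, 0), (1, 3), (5, 4), (2, 6)]
triangles = [(0, 1, 2), (1, 3, 2), (2, 3, 4)]

# Arbitrary affine map x -> A x + t (shear, scale, and translation).
A = np.array([[1.5, 0.4], [-0.3, 0.8]])
t = np.array([10.0, -2.0])
moved = np.asarray(points, dtype=float) @ A.T + t

before = triangle_areas(points, triangles)
after = triangle_areas(moved, triangles)
cos_angle, length_ratio, score = similarity(before, after)
```

Every triangle area is scaled by the same factor |det A|, so the two area vectors are parallel: `cos_angle` equals 1 up to rounding, and only `length_ratio` reflects the change in object size.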
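
For readers unfamiliar with golden section search, a minimal one-dimensional sketch follows. It is not the 2-D algorithm used in the paper, whose details the abstract does not give; the function name `golden_section_max`, the tolerance, and the toy similarity function are hypothetical. The key point is that each iteration shrinks the search interval by a constant factor of about 0.618 while needing only one new function evaluation.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # reciprocal of the golden ratio

def golden_section_max(f, lo, hi, tol=1e-6):
    """Locate the maximizer of a unimodal function f on [lo, hi]."""
    x1 = hi - INV_PHI * (hi - lo)
    x2 = lo + INV_PHI * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:
            # Maximum lies in [x1, hi]; the old x2 becomes the new x1.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + INV_PHI * (hi - lo)
            f2 = f(x2)
        else:
            # Maximum lies in [lo, x2]; the old x1 becomes the new x2.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - INV_PHI * (hi - lo)
            f1 = f(x1)
    return (lo + hi) / 2

# Toy unimodal "similarity" over a displacement d, peaking at d = 2.5.
def toy_similarity(d):
    return -(d - 2.5) ** 2

best = golden_section_max(toy_similarity, -8.0, 8.0)
```

In a tracking setting, `f` would be the object similarity as a function of displacement, and the search assumes that this function has a single peak inside the search window.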