SUMMARY For robust visual tracking, the main challenge for a subspace representation model is handling the varied appearances of the target object. Traditional subspace learning tracking algorithms neglect the discriminative correlation between different multi-view target samples and the effectiveness of sparse subspace learning. To learn a better subspace representation model, we design a discriminative graph that models both the labeled target samples with various appearances and the updated foreground and background samples, which are selected using an incremental updating scheme. The proposed discriminative graph structure not only explicitly captures multi-modal intraclass correlations within labeled samples but also balances the within-class local manifold against global discriminative information from foreground and background samples. Based on the discriminative graph, we obtain a sparse embedding by using the L2,1-norm, which is incorporated to select relevant features and learn the transformation in a unified framework. In the tracking procedure, the subspace learning is embedded into a Bayesian inference framework using compound motion estimation and a discriminative observation model, which makes localization more effective and accurate. Experiments on several videos demonstrate that the proposed algorithm is robust to various appearance changes, especially in dynamically changing and cluttered scenes, and performs better than alternatives reported in the recent literature.
key words: visual tracking, sparse subspace, discriminative graph
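
As a brief illustration of why an L2,1-norm penalty can select features, the sketch below computes the norm in its standard form (the sum of the Euclidean norms of the rows of a projection matrix) and lists the rows that remain nonzero. The function names `l21_norm` and `selected_features`, the tolerance, and the example matrix `W` are assumptions made for illustration only; they are not taken from the proposed algorithm.

```python
import math


def row_norm(row):
    """Euclidean norm of one row of the projection matrix."""
    return math.sqrt(sum(w * w for w in row))


def l21_norm(W):
    """L2,1-norm: sum over rows (features) of each row's Euclidean norm."""
    return sum(row_norm(row) for row in W)


def selected_features(W, tol=1e-8):
    """Indices of features whose row is not (numerically) zero."""
    return [i for i, row in enumerate(W) if row_norm(row) > tol]


# Hypothetical projection matrix: 4 input features mapped to a 2-D subspace.
# Rows 1 and 2 are entirely zero, so those features are discarded.
W = [
    [3.0, 4.0],
    [0.0, 0.0],
    [0.0, 0.0],
    [6.0, 8.0],
]

print(l21_norm(W))           # 15.0
print(selected_features(W))  # [0, 3]
```

Because the penalty sums whole-row norms, minimizing it tends to drive entire rows to zero, which is how feature selection and learning the transformation can be handled in one objective.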