Systems for still-to-video face recognition (FR) are typically used to detect target individuals in watch-list screening applications. These surveillance applications are challenging because the appearance of faces changes according to capture conditions, and very few reference stills are available a priori for enrollment. To improve performance, an adaptive appearance model tracker (AAMT) is proposed for on-line learning of a track-face-model linked to each individual appearing in the scene. For robust spatiotemporal FR, these track-face-models are matched over successive frames against stored gallery-face-models, which are extracted from the reference still images of each target individual enrolled to the system. Compared to the gallery-face-models produced by self-updating FR systems, the track-face-models produced by the AAMT-FR system are updated from facial captures that are more reliably selected, and can incorporate a greater diversity of intra-class variation from the operational environment. Performance of the proposed approach is compared with several state-of-the-art FR systems on videos from the Chokepoint dataset when a single reference template per target individual is stored in the gallery. Experimental results show that the proposed system can achieve a significantly higher level of FR performance, especially when the diverse facial appearances captured through AAMT correspond to those of the reference stills.