Automated behavior quantification requires accurate tracking of animals. Simultaneous tracking of multiple animals, especially those lacking visual identifiers, is particularly challenging. Here we propose a markerless video-based tool to simultaneously track two socially interacting mice of the same color. It combines conventional handcrafted tracking with deep-learning-based techniques, the latter trained on a small number of labeled images from a very basic, uncluttered experimental setup. The output consists of a body mask and the coordinates of the snout and tail-base for each mouse. The method was tested on a series of cross-setup videos recorded under commonly used experimental conditions, including bedding in the cage and fiberoptic or headstage implants on the mice. Results obtained without any human intervention showed the effectiveness of the proposed approach, evidenced by a near elimination of identity switches and a 10% improvement in tracking accuracy. This suggests that the hybrid approach could be valuable for studying group behaviors, such as social interaction. The approach addresses two problems common in existing methods: mistaken identities and lost information on key anatomical features. Finally, we demonstrated an application of this approach in studies of the social behavior of mice by using it to quantify and compare interactions between pairs of mice in which some are anosmic, i.e. unable to smell. Our results indicated that loss of olfaction impaired typical snout-directed social recognition behaviors of mice, while non-snout-directed social behaviors were enhanced.