The need for automated and efficient systems for tracking full animal pose has increased with the complexity of behavioral data and analyses. Here we introduce LEAP (LEAP estimates animal pose), a deep-learning-based method for predicting the positions of animal body parts. The framework consists of a graphical interface for labeling body parts and training the network. LEAP offers fast prediction on new data, and training with as few as 100 frames achieves 95% of peak performance. We validated LEAP using videos of freely behaving fruit flies, tracking 32 distinct points to describe the pose of the head, body, wings and legs with an error rate of <3% of body length. We recapitulated reported findings on insect gait dynamics and demonstrated LEAP's applicability to unsupervised behavioral classification. Finally, we extended the method to more challenging imaging situations and to videos of freely moving mice.
To select training images that span the repertoire of poses across the entire data set, we use a technique we refer to as cluster sampling. A simple random subset of the movie images is grouped via k-means clustering, and these images are then sampled uniformly across groups for labeling. The grouping is based on linear correlations between pixel intensities in the images as a proxy measure for similarity in body pose. The diversity of poses represented using this method can be observed in the centroids of each of the clusters identified (Supplementary Fig. 2).

Poses in each training image are labeled using a custom GUI with draggable body part markers that form a skeleton (Fig. 1b). For the fruit fly, we track four points on each of the six legs, two points on the wing tips, three points on the thorax and abdomen, and three points on the head, for a total of 32 points in every frame. These points were chosen to align with known Drosophila body joints (Supplementary Fig. 3). For every training image, the user drags each skeleton point to the appropriate body part, and the program saves the label positions into a self-contained file. To enlarge the training set without hand-labeling additional frames, we augment the dataset by applying small random rotations and body-axis reflections to the labeled data, generating new samples. Because the neural network processes the raw images, the rotated and reflected images add new information that the network can use during training.
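The cluster-sampling procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `kmeans` helper, cluster count, and sample size are assumptions, and z-scoring each flattened frame is used here as a stand-in for the correlation-based similarity described in the text (Euclidean distance between z-scored vectors is a monotone function of pixel-intensity correlation).

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Minimal k-means (Lloyd's algorithm) on the rows of X; returns cluster labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every frame to every centroid.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels

def cluster_sample(frames, n_clusters=5, n_samples=20, seed=0):
    """Pick a pose-diverse subset of frame indices for labeling.

    frames: (n_frames, height, width) grayscale images.
    Hypothetical parameter choices; the paper does not fix these values here.
    """
    rng = np.random.default_rng(seed)
    flat = frames.reshape(len(frames), -1).astype(float)
    # Z-score each frame so distance tracks pixel-intensity correlation.
    flat = (flat - flat.mean(axis=1, keepdims=True)) / (flat.std(axis=1, keepdims=True) + 1e-8)
    labels = kmeans(flat, n_clusters, seed=seed)
    per_cluster = max(1, n_samples // n_clusters)
    picked = []
    for j in range(n_clusters):
        idx = np.flatnonzero(labels == j)
        take = min(per_cluster, len(idx))
        picked.extend(rng.choice(idx, size=take, replace=False).tolist())
    return sorted(picked)
```

Sampling uniformly across clusters rather than across raw frames is what guards against rare poses being swamped by the most common ones.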
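The rotation-and-reflection augmentation can likewise be sketched for a single labeled frame. This is an illustrative helper under stated assumptions: the angle range, the nearest-neighbor resampling, and the choice of a vertical reflection axis are not specified by the text; the key point it demonstrates is that image and label coordinates must be transformed together.

```python
import numpy as np

def augment(image, points, max_deg=15.0, seed=0):
    """Return a randomly rotated (and possibly reflected) copy of one labeled frame.

    image: (H, W) array; points: (n_points, 2) array of (x, y) label coordinates.
    max_deg is an assumed bound on the "small random rotations" from the text.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    theta = np.deg2rad(rng.uniform(-max_deg, max_deg))
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])

    # Rotate the label coordinates about the image center.
    new_pts = (points - [cx, cy]) @ R.T + [cx, cy]

    # Rotate the image by inverse-mapping each output pixel (nearest neighbor).
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([xs.ravel() - cx, ys.ravel() - cy], axis=1) @ R  # inverse rotation
    src_x = np.clip(np.rint(coords[:, 0] + cx).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(coords[:, 1] + cy).astype(int), 0, h - 1)
    new_img = image[src_y, src_x].reshape(h, w)

    # Reflect across the (assumed vertical) body axis half the time.
    if rng.random() < 0.5:
        new_img = new_img[:, ::-1]
        new_pts = new_pts.copy()
        new_pts[:, 0] = (w - 1) - new_pts[:, 0]
    return new_img, new_pts
```

Applying the identical transform to pixels and keypoints keeps each augmented frame a valid training example, which is why such samples add usable information to the network.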