The need for automated and efficient systems for tracking full animal pose has increased with the complexity of behavioral data and analyses. Here we introduce LEAP (LEAP estimates animal pose), a deep-learning-based method for predicting the positions of animal body parts. The framework comprises a graphical interface for labeling body parts and a pipeline for training the network. LEAP offers fast prediction on new data, and training with as few as 100 frames achieves 95% of peak performance. We validated LEAP using videos of freely behaving fruit flies, tracking 32 distinct points that describe the pose of the head, body, wings, and legs with an error rate of <3% of body length. We recapitulated reported findings on insect gait dynamics and demonstrated LEAP's applicability to unsupervised behavioral classification. Finally, we extended the method to more challenging imaging situations and to videos of freely moving mice.
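LEAP's network maps each raw frame to one confidence map per tracked body part, with the map's peak giving that part's image coordinates. Below is a minimal sketch of this idea in PyTorch; the layer widths, depth, and helper names are illustrative assumptions, not the published architecture.

```python
# Minimal sketch of a confidence-map pose estimator in the spirit of LEAP.
# Layer widths and depth are illustrative assumptions, not the published
# architecture.
import torch
import torch.nn as nn

class ConfidenceMapNet(nn.Module):
    """Predict one 2D confidence map per tracked body part."""
    def __init__(self, n_parts: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(64, n_parts, 3, padding=1),  # one map per body part
        )

    def forward(self, x):  # x: (batch, 1, H, W) grayscale frames
        return self.net(x)

def peaks_to_coords(maps: torch.Tensor) -> torch.Tensor:
    """Convert (batch, parts, H, W) confidence maps to (batch, parts, 2) (x, y) peaks."""
    b, p, h, w = maps.shape
    flat = maps.reshape(b, p, -1).argmax(dim=-1)
    x = flat % w
    y = torch.div(flat, w, rounding_mode="floor")
    return torch.stack([x, y], dim=-1)
```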
To ensure that the labeled dataset captures the diversity of poses present across the entire data set, we use a technique we refer to as cluster sampling. A simple random subset of the movie images is grouped via k-means clustering, and these images are then sampled uniformly across groups for labeling. The grouping is based on linear correlations between pixel intensities in the images as a proxy measure for similarity in body pose. The diversity of poses represented using this method can be observed in the centroids of each of the clusters identified (Supplementary Fig. 2).

Poses in each training image are labeled using a custom GUI with draggable body-part markers that form a skeleton (Fig. 1b). For the fruit fly, we track four points on each of the six legs, two points on the wing tips, three points on the thorax and abdomen, and three points on the head, for a total of 32 points in every frame. These points were chosen to align with known Drosophila body joints (Supplementary Fig. 3). For every training image, the user drags each skeleton point to the appropriate body part, and the program saves the label positions into a self-contained file. To further enlarge the training set without hand labeling additional frames, we augment the dataset by applying small random rotations and body-axis reflections to the labeled data, generating new samples. Because the neural network processes the raw images, the rotated and reflected images add new information that the network can use during training.
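The two steps described above, cluster sampling and label-preserving augmentation, can be sketched as follows. This is an illustrative reimplementation, assuming grayscale frames and (row, col) landmark coordinates; the cluster count, rotation range, and function names are assumptions rather than the authors' exact code.

```python
# Illustrative sketch of cluster sampling and of rotation/reflection
# augmentation, assuming grayscale frames and (row, col) landmark coordinates.
import numpy as np
from scipy.ndimage import affine_transform
from sklearn.cluster import KMeans

def cluster_sample(images, n_clusters=10, per_cluster=10, seed=0):
    """Group frames by k-means, then draw frames uniformly across groups.

    images: (n_frames, H, W). Standardizing each flattened frame makes
    Euclidean k-means act as a proxy for clustering on pixel correlations.
    Returns indices of the frames selected for labeling.
    """
    rng = np.random.default_rng(seed)
    flat = images.reshape(len(images), -1).astype(np.float64)
    flat = (flat - flat.mean(1, keepdims=True)) / (flat.std(1, keepdims=True) + 1e-8)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(flat)
    picks = []
    for k in range(n_clusters):
        members = np.flatnonzero(labels == k)
        picks.extend(rng.choice(members, min(per_cluster, len(members)), replace=False))
    return np.sort(np.asarray(picks))

def augment(image, points_rc, max_deg=15.0, seed=0):
    """Apply a small random rotation plus a coin-flip left/right reflection
    to a frame and its (row, col) body-part labels, keeping both consistent."""
    rng = np.random.default_rng(seed)
    t = np.deg2rad(rng.uniform(-max_deg, max_deg))
    R = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])
    ctr = (np.array(image.shape) - 1) / 2.0
    # affine_transform maps each output coord o to input coord R.T @ (o - ctr) + ctr,
    # so the image is rotated by R about its center.
    out = affine_transform(image, R.T, offset=ctr - R.T @ ctr, order=1)
    pts = (points_rc - ctr) @ R.T + ctr
    if rng.random() < 0.5:  # body-axis reflection (left/right flip)
        out = out[:, ::-1]
        pts[:, 1] = (image.shape[1] - 1) - pts[:, 1]
    return out, pts
```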
Comprehensive descriptions of animal behavior require precise measurements of 3D whole-body movements. Although 2D approaches can track visible landmarks in restrictive environments, performance drops in freely moving animals, due to occlusions and appearance changes. Therefore, we designed DANNCE to robustly track anatomical landmarks in 3D across species and behaviors. DANNCE uses projective geometry to construct inputs to a convolutional neural network that leverages learned 3D geometric reasoning. We trained and benchmarked DANNCE using a 7-million frame dataset that relates color videos and rodent 3D poses. In rats and mice, DANNCE robustly tracked dozens of landmarks on the head, trunk, and limbs of freely moving animals.
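One way to read "uses projective geometry to construct inputs" is that image features are sampled at the 2D projections of a 3D voxel grid centered on the animal, producing a volumetric input for a 3D CNN. The sketch below illustrates that unprojection step for a single camera; the grid extent, resolution, and names are assumptions for illustration, not DANNCE's actual implementation.

```python
# Illustrative sketch of unprojecting one camera view into a 3D voxel volume.
# Grid extent, resolution, and variable names are assumptions for illustration.
import numpy as np

def build_volume(image, P, center, extent=120.0, res=64):
    """Sample a camera image at the projections of a voxel grid.

    image : (H, W) grayscale view (a real system would fuse multi-channel
            features from several calibrated cameras).
    P     : (3, 4) camera projection matrix mapping homogeneous world
            coordinates to homogeneous pixel coordinates.
    center: (3,) world-space point (e.g., the animal's estimated position)
            around which the voxel grid is placed.
    """
    h, w = image.shape
    ax = np.linspace(-extent / 2, extent / 2, res)
    gx, gy, gz = np.meshgrid(ax, ax, ax, indexing="ij")
    world = np.stack([gx, gy, gz], axis=-1).reshape(-1, 3) + center
    homo = np.concatenate([world, np.ones((len(world), 1))], axis=1)  # (N, 4)
    pix = homo @ P.T                      # (N, 3) homogeneous pixel coords
    uv = pix[:, :2] / pix[:, 2:3]         # perspective divide
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    return image[v, u].reshape(res, res, res)  # volume fed to a 3D CNN
```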
Deciphering how brains generate behavior depends critically on an accurate description of behavior. If distinct behaviors are lumped together, separate modes of brain activity can be wrongly attributed to the same behavior. Alternatively, if a single behavior is split into two, the same neural activity can appear to produce different behaviors. Here, we address this issue in the context of acoustic communication in Drosophila. During courtship, males vibrate their wings to generate time-varying songs, and females evaluate songs to inform mating decisions. For 50 years, Drosophila melanogaster song was thought to consist of only two modes, sine and pulse, but using unsupervised classification methods on large datasets of song recordings, we now establish the existence of at least three song modes: two distinct pulse types, along with a single sine mode. We show how this seemingly subtle distinction affects our interpretation of the mechanisms underlying song production and perception. Specifically, we show that visual feedback influences the probability of producing each song mode and that male song mode choice affects female responses and contributes to modulating his song amplitude with distance. At the neural level, we demonstrate how the activity of four separate neuron types within the fly's song pathway differentially affects the probability of producing each song mode. Our results highlight the importance of carefully segmenting behavior to map the underlying sensory, neural, and genetic mechanisms.
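As an illustration of the kind of unsupervised classification the abstract refers to, one can cluster normalized pulse waveforms with PCA followed by a Gaussian mixture, selecting the number of modes by BIC. This is a hedged sketch, not the paper's exact pipeline; the feature dimensionality and model choices are assumptions.

```python
# Hedged sketch of unsupervised song-mode discovery: PCA + Gaussian mixture
# over normalized pulse waveforms, with the mode count chosen by BIC.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def classify_pulses(pulses, max_modes=6, seed=0):
    """pulses: (n_pulses, n_samples) array of time-aligned pulse waveforms."""
    # Normalize each pulse so clustering reflects shape, not amplitude.
    norm = pulses / (np.abs(pulses).max(axis=1, keepdims=True) + 1e-12)
    feats = PCA(n_components=10, random_state=seed).fit_transform(norm)
    models = [GaussianMixture(n_components=k, random_state=seed).fit(feats)
              for k in range(1, max_modes + 1)]
    best = min(models, key=lambda m: m.bic(feats))  # lower BIC = better fit
    return best.predict(feats), best.n_components
```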
Parallel developments in neuroscience and deep learning have led to mutually productive exchanges, pushing our understanding of real and artificial neural networks in sensory and cognitive systems. However, this interaction between fields is less developed in the study of motor control. In this work, we develop a virtual rodent as a platform for the grounded study of motor activity in artificial models of embodied control. We then use this platform to study motor activity across contexts by training a model to solve four complex tasks. Using methods familiar to neuroscientists, we describe the behavioral representations and algorithms employed by different layers of the network using a neuroethological approach to characterize motor activity relative to the rodent's behavior and goals. We find that the model uses two classes of representations which respectively encode the task-specific behavioral strategies and task-invariant behavioral kinematics. These representations are reflected in the sequential activity and population dynamics of neural subpopulations. Overall, the virtual rodent facilitates grounded collaborations between deep reinforcement learning and motor neuroscience.
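One simple way to probe whether a layer encodes task-specific strategy versus task-invariant kinematics, in the spirit of the analyses described above, is to compare how well its population activity decodes task identity versus moment-to-moment kinematics. The sketch below is a hedged illustration; the decoding models and data shapes are assumptions, not the study's actual methods.

```python
# Hedged sketch: compare how well a layer's population activity decodes task
# identity (strategy) versus continuous kinematics. Shapes and models are
# illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.model_selection import cross_val_score

def layer_content(activity, kinematics, task_ids):
    """activity: (T, units); kinematics: (T, features); task_ids: (T,) ints."""
    kin_r2 = cross_val_score(Ridge(alpha=1.0), activity, kinematics, cv=5).mean()
    task_acc = cross_val_score(LogisticRegression(max_iter=1000),
                               activity, task_ids, cv=5).mean()
    return {"kinematics_r2": kin_r2, "task_accuracy": task_acc}
```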
Understanding the biological basis of social and collective behaviors in animals is a key goal of the life sciences, and may yield important insights for engineering intelligent multi-agent systems. A critical step in interrogating the mechanisms underlying social behaviors is a precise readout of the 3D pose of interacting animals. While approaches for multi-animal pose estimation are beginning to emerge, they remain challenging to compare due to the lack of standardized training and benchmark datasets. Here we introduce the PAIR-R24M (Paired Acquisition of Interacting oRganisms - Rat) dataset for multi-animal 3D pose estimation, which contains 24.3 million frames of RGB video and 3D ground-truth motion capture of dyadic interactions in laboratory rats. PAIR-R24M contains data from 18 distinct pairs of rats and 24 different viewpoints. We annotated the data with 11 behavioral labels and 3 interaction categories to facilitate benchmarking in rare but challenging behaviors. To establish a baseline for markerless multi-animal 3D pose estimation, we developed a multi-animal extension of DANNCE, a recently published network for 3D pose estimation in freely behaving laboratory animals. As the first large multi-animal 3D pose estimation dataset, PAIR-R24M will help advance 3D animal tracking approaches and aid in elucidating the neural basis of social behaviors.
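For benchmarking on a dataset like PAIR-R24M, one common design choice is to hold out entire rat pairs so that test animals are never seen during training. The sketch below illustrates such a leakage-aware split; the metadata field is a hypothetical stand-in, not the dataset's actual schema.

```python
# Hypothetical sketch of a leakage-aware benchmark split: hold out whole rat
# pairs so test animals never appear in training. Field names are assumptions.
import numpy as np

def split_by_pair(frame_pair_ids, test_fraction=0.2, seed=0):
    """frame_pair_ids: (n_frames,) integer ID of the rat pair in each frame.
    Returns (train_indices, test_indices)."""
    rng = np.random.default_rng(seed)
    pairs = np.unique(frame_pair_ids)
    test_pairs = rng.choice(pairs, size=max(1, int(len(pairs) * test_fraction)),
                            replace=False)
    test_mask = np.isin(frame_pair_ids, test_pairs)
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)
```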