2023
DOI: 10.48550/arxiv.2301.06187
Preprint

CNN-Based Action Recognition and Pose Estimation for Classifying Animal Behavior from Videos: A Survey

Abstract: Classifying the behavior of humans or animals from videos is important in biomedical fields for understanding brain function and response to stimuli. Action recognition, classifying activities performed by one or more subjects in a trimmed video, forms the basis of many of these techniques. Deep learning models for human action recognition have progressed significantly over the last decade. Recently, there is an increased interest in research that incorporates deep learning-based action recognition for animal …

Cited by 2 publications (2 citation statements)
References 122 publications (251 reference statements)
“…Supervised methods that have been applied are conventional methods as bag-of-words (Dollár et al, 2005), Bayesian classification (Dam et al, 2013) or tree-based classifiers used in MARS (Segalin et al, 2021) and SimBA (Nilsson et al, 2020). Perez and Toler-Franklin (2023) provide an overview of CNN-based approaches, such as 2D, Two Stream networks and 3D-CNNs, often combined with recurrent head to model the temporal dependencies. In recent years, major advances in deep learning classification are made using Transformer architectures that are designed to pick up the most relevant context without constraints on how far away that context is.…”
Section: Related Work Supervised Behavior Recognition (mentioning)
confidence: 99%
“…Conversely, bottom-up methods directly detect all skeletal keypoints in the image, demanding precise joint localization and accurate skeleton information retrieval. Animal pose estimation holds significant importance and practical value in animal behavior research, animal conservation, agricultural production, animal health monitoring, and more [2][3][4].…”
Section: Introduction (mentioning)
confidence: 99%