Diogo C. Luvizon scite author profile

Action recognition and human pose estimation are closely related, but the two problems are generally handled as distinct tasks in the literature. In this work, we propose a multitask framework for joint 2D and 3D pose estimation from still images and human action recognition from video sequences. We show that a single architecture can solve the two problems efficiently and still achieve state-of-the-art results. Additionally, we demonstrate that end-to-end optimization leads to significantly higher accuracy than separate learning. The proposed architecture can be trained seamlessly with data from different categories simultaneously. The reported results on four datasets (MPII, Human3.6M, Penn Action and NTU) demonstrate the effectiveness of our method on the targeted tasks.

A Video-Based System for Vehicle Speed Measurement in Urban Roadways

Luvizon

Nassu

Minetto

2016

IEEE Trans. Intell. Transport. Syst.

Human pose regression by combining indirect part detection and contextual information

Luvizon

Tabia

Picard

2019

Computers & Graphics

164

In this paper, we propose an end-to-end trainable regression approach for human pose estimation from still images. We use the proposed Soft-argmax function to convert feature maps directly to joint coordinates, resulting in a fully differentiable framework. Our method is able to learn heat map representations indirectly, without additional steps of artificial ground truth generation. Consequently, contextual information can be included in the pose predictions in a seamless way. We evaluated our method on two very challenging datasets, the Leeds Sports Poses (LSP) and the MPII Human Pose datasets, reaching the best performance among all existing regression methods and results comparable to the state-of-the-art detection-based approaches.
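As a rough illustration of the idea in this abstract, a soft-argmax can be sketched as a spatial softmax over a feature map followed by the expected value of the pixel coordinates, which keeps the whole mapping differentiable. The function name `soft_argmax_2d`, the use of NumPy, and the normalization of coordinates to pixel centers in [0, 1] are assumptions made for this sketch, not details taken from the paper.

```python
import numpy as np

def soft_argmax_2d(heatmap):
    """Expected (x, y) location of one H x W feature map, normalized to [0, 1].

    Illustrative sketch only; the coordinate convention is an assumption.
    """
    h, w = heatmap.shape
    # Spatial softmax; subtracting the max keeps exp() numerically stable.
    z = np.exp(heatmap - heatmap.max())
    prob = z / z.sum()
    # Normalized coordinates of the pixel centers along each axis.
    xs = (np.arange(w) + 0.5) / w
    ys = (np.arange(h) + 0.5) / h
    # Expectation of each coordinate under the softmax distribution.
    x = float((prob.sum(axis=0) * xs).sum())
    y = float((prob.sum(axis=1) * ys).sum())
    return x, y
```

A uniform map yields the image center, while a map with a single strong peak yields approximately the center of the peak pixel.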

Multi-task Deep Learning for Real-Time 3D Human Pose Estimation and Action Recognition

Luvizon

Picard

Tabia

2020

IEEE Trans. Pattern Anal. Mach. Intell.

Human pose estimation and action recognition are related tasks since both problems are strongly dependent on the human body representation and analysis. Nonetheless, most recent methods in the literature handle the two problems separately. In this work, we propose a multi-task framework for jointly estimating 2D or 3D human poses from monocular color images and classifying human actions from video sequences. We show that a single architecture can solve both problems efficiently and still achieve state-of-the-art or comparable results on each task while running at a throughput of more than 100 frames per second. The proposed method benefits from extensive parameter sharing between the two tasks by unifying the processing of still images and video clips in a single pipeline, allowing the model to be trained seamlessly with data from different categories simultaneously. Additionally, we provide important insights for end-to-end training of the proposed multi-task model by decoupling key prediction parts, which consistently leads to better accuracy on both tasks. The reported results on four datasets (MPII, Human3.6M, Penn Action and NTU RGB+D) demonstrate the effectiveness of our method on the targeted tasks. Our source code and trained weights are publicly available at

Vehicle speed estimation by license plate detection and tracking

Luvizon

Nassu

Minetto

2014

Learning features combination for human action recognition from skeleton sequences

Luvizon

Tabia

Picard

2017

Pattern Recognition Letters

Human action recognition is a challenging task due to the complexity of human movements and to the variation among the same actions performed by distinct subjects. Recent technologies provide a skeletal representation of the human body extracted in real time from depth maps, which is highly discriminative information for efficient action recognition. In this context, we present a new framework for human action recognition from skeleton sequences. We propose extracting sets of spatial and temporal local features from subgroups of joints, which are aggregated by a robust method based on the VLAD algorithm and a pool of clusters. Several feature vectors are then combined by a metric learning method inspired by the LMNN algorithm, with the objective of improving the classification accuracy using the nonparametric k-NN classifier. We evaluated our method on three public datasets: MSR-Action3D, UTKinect-Action3D, and Florence 3D Actions. As a result, the proposed framework outperforms the state-of-the-art methods in all experiments.

Human Pose Regression by Combining Indirect Part Detection and Contextual Information

Luvizon

Tabia

Picard

2017

Preprint

Consensus-Based Optimization for 3D Human Pose Estimation in Camera Coordinates

2022

2D/3D Pose Estimation and Action Recognition Using Multitask Deep Learning
