Three-Stream 3D/1D CNN for Fine-Grained Action Classification and Segmentation in Table Tennis

Martin, Pierre‐Etienne; Benois‐Pineau, Jenny; Péteri, Renaud; Morlier, Julien

doi:10.1145/3475722.3482793

Cited by 8 publications

(3 citation statements)

References 32 publications

(36 reference statements)

“…The attention blocks show a 5% increase in the classification accuracies compared to their baseline models. Pierre-Etienne Martin et al. [15] have also introduced a three-stream (RGB, optical flow, and pose estimation) 3D/1D CNN model for stroke classification and detection tasks.…”

Section: Related Work (mentioning)

confidence: 99%

Table Tennis Stroke Detection and Recognition Using Ball Trajectory Data

Kulkarni, Jamadagni, Paul et al. 2022

SSRN Journal

“…Recently, researchers pay much attention to sports videos to address the challenges, including building datasets [12,20,24,35,42] and proposing new models [23,36,48]. In this paper, we focus on a specific sport - table tennis and propose a dataset - Ping Pang Action (P²A) for action recognition and localization to facilitate researches on fine-grained action understanding.…”

Section: Introduction (mentioning)

confidence: 99%

P²A: A Dataset and Benchmark for Dense Action Detection from Table Tennis Match Broadcasting Videos

Bian, Wang, Xiong et al. 2022

Preprint

While deep learning has been widely used for video analytics, such as video classification and action detection, dense action detection with fast-moving subjects in sports videos is still challenging. In this work, we release yet another sports video dataset, P²A, for Ping Pong-Action detection, which consists of 2,721 video clips collected from the broadcast videos of professional table tennis matches in the World Table Tennis Championships and Olympiads. We work with a crew of table tennis professionals and referees to obtain fine-grained action labels (in 14 classes) for every ping-pong action that appears in the dataset, and formulate two sets of action detection problems: action localization and action recognition. We evaluate a number of commonly seen action recognition models (e.g., TSM, TSN, Video SwinTransformer, and Slowfast) and action localization models (e.g., BSN, BSN++, BMN, TCANet) on P²A for both problems, under various settings. These models achieve only 48% area under the AR-AN curve for localization and 82% top-one accuracy for recognition, since the ping-pong actions are dense with fast-moving subjects while the broadcast videos have only 25 FPS. The results confirm that P²A is still a challenging task and can be used as a benchmark for action detection from videos.

CCS Concepts: Computing methodologies → Activity recognition and understanding; Video segmentation.

“…This representation can help to reduce the gap between the real world and the synthetic data. Lastly, human pose estimation, which is considered as articulated object pose estimation, is of importance in various computer vision and robotic tasks such as action recognition [12].…”

Section: Introduction (mentioning)

confidence: 99%

Review on 6D Object Pose Estimation With the Focus on Indoor Scene Understanding

Nejatishahidin, Pooya 2022

AAIML

The 6D object pose estimation problem has been extensively studied in the fields of computer vision and robotics. It has a wide range of applications, such as robot manipulation, augmented reality, and 3D scene understanding. With the advent of deep learning, many breakthroughs have been made; however, approaches continue to struggle when they encounter unseen instances, new categories, or real-world challenges such as cluttered backgrounds and occlusions. In this study, we explore the available methods based on input modality, problem formulation, and whether they take a category-level or instance-level approach. As part of our discussion, we focus on how 6D object pose estimation can be used for understanding 3D scenes.

Abstract: Figure 1: Frames of an "Offensive Forehand Hit" stroke from TTStroke-21 with its estimated pose and optical flow.
