Nowadays, coaches and sport analysts are concerned with sport performance analysis through match videos. However, they still use the conventional method of manually observing the full video, which is troublesome because they might miss meaningful information present in the video. Several previous studies have discussed tracking ball movements, identifying players based on jersey colour and number, and detecting player movement in various types of sport such as soccer and volleyball, but not in badminton. Therefore, this study focused on developing an automated system using the Faster Region-based Convolutional Neural Network (Faster R-CNN) to track the position of badminton players in sport broadcast videos. In preparing the dataset for training and testing, several broadcast videos were converted into image frames before labelling the regions that indicate the players. After that, several different trained Faster R-CNN detectors were produced from the dataset and then tested on a different set of videos to evaluate detector performance. In evaluating the performance of each detector model, the average precision was obtained from the precision-recall graph. As a result, this study revealed that the detector successfully detects the players when it is fed with a more generalized dataset.
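The evaluation step described above, obtaining average precision from a precision-recall graph, can be sketched as follows. This is a minimal, generic implementation of interpolated average precision (the convention widely used for object-detection benchmarks), not the authors' exact evaluation code; the function name and input format are assumptions for illustration.

```python
def average_precision(recalls, precisions):
    """Area under the precision-recall curve using interpolated
    precision (the precision at recall r is taken as the maximum
    precision at any recall >= r), as is common in detection evaluation.
    `recalls` must be sorted in increasing order, paired with `precisions`."""
    # Pad the curve so it spans recall 0 to 1.
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Interpolation: make precision monotonically non-increasing from the right.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas wherever recall increases.
    ap = 0.0
    for i in range(1, len(r)):
        ap += (r[i] - r[i - 1]) * p[i]
    return ap
```

For example, a detector whose precision stays at 1.0 across all recall levels scores an AP of 1.0, while one that reaches full recall at a constant precision of 0.5 scores 0.5.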
Sport is a competitive field and an element of measurement of a country's development. For this reason, sport analysis has become one of the major contributors to analysing and improving the performance level of an athlete. The video-based modality has become a crucial tool used in sport analysis by coaches and performance analysts, and a wide variety of techniques are used in sport video analysis. The main purposes of this review paper are to compare and provide an updated review of the traditional handcrafted approach and the deep learning approach to sport video analysis based on human activity recognition, to give an overview of recent studies in video-based human activity recognition for sport analysis, and finally to conclude with potential future directions in sport video analysis.
<p class="Abstract">Sport performance analysis, which is crucial in sport practice, is used to improve the performance of athletes during games. Many studies have been done on detecting different player movements for notational analysis using either sensor-based or video-based modalities. Recently, the vision-based modality has become a research interest due to the vast development of online video transmission. Tremendous experimental studies have been done using the vision-based modality in sport, but only a few review studies have been done previously. Hence, we provide a review of video-based techniques for recognizing sport actions, toward establishing an automated notational analysis system. The paper is organized into four parts. Firstly, we provide an overview of the current existing technologies of video-based sports intelligence systems. Secondly, we review the framework of action recognition across fields before we further discuss the implementation of deep learning in the vision-based modality for sport actions. Finally, the paper summarizes further trends and research directions in action recognition for sports using the video approach. We believe that this review would be very beneficial in providing a complete overview of video-based action recognition in sports.</p>
Automated action recognition is useful for improving the performance of athletes through notational analysis. Notational analysis is usually used by the coach or notational analyst to study movement patterns, strategy and tactics; action recognition is therefore the key step before further analysis can be done. This paper focuses on developing automated badminton action recognition using a vision-based dataset. 1496 badminton match image frames of 5 actions were studied: smash, clear, drop, net shot and lift. At first, the dataset was split 0.8:0.2 for training and testing the classification task by machine learning. Secondly, features of the training dataset were extracted using the AlexNet Convolutional Neural Network (CNN) model. In extracting the features, we introduced a new local feature extractor technique that extracts features at the fc8 layer. After collecting all the features at the fc8 layer, the features were classified using a machine learning classifier, a linear Support Vector Machine (SVM). The experiment was repeated using the normal global feature extractor technique. Lastly, both the new local and global feature extractor techniques were repeated using the GoogLeNet CNN model to compare the performance of the AlexNet and GoogLeNet models. The results show that the new local feature extractor using the AlexNet CNN model has the best performance accuracy, which is 82.0%.