Spatio-temporal representations in frame sequences play an important role in action recognition. Previously, using optical flow as temporal information in combination with a set of RGB images that contain spatial information has shown great performance improvements in action recognition tasks. However, this approach is computationally expensive and requires a two-stream (RGB and optical flow) framework. In this paper, we propose MFNet (Motion Feature Network), which contains motion blocks that make it possible to encode spatio-temporal information between adjacent frames in a unified network that can be trained end-to-end. The motion block can be attached to any existing CNN-based action recognition framework with only a small additional cost. We evaluated our network on two action recognition datasets (Jester and Something-Something) and achieved competitive performance on both datasets by training the networks from scratch.
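As a rough illustration of what "information between adjacent frames" can mean, the sketch below computes element-wise differences between consecutive single-channel frames. This is a deliberately simplified stand-in and not the motion block described in the abstract, whose internals are not given here; the names `Frame` and `adjacent_frame_differences` are hypothetical.

```python
from typing import List

# Assumption: a frame is a single-channel H x W grid of floats.
Frame = List[List[float]]


def adjacent_frame_differences(frames: List[Frame]) -> List[Frame]:
    """Return T-1 difference maps, one per pair of adjacent frames.

    Each map holds frame[t + 1] - frame[t] at every position, which is
    a very crude proxy for motion between the two frames.
    """
    diffs: List[Frame] = []
    for prev, nxt in zip(frames, frames[1:]):
        diffs.append(
            [[b - a for a, b in zip(row_prev, row_next)]
             for row_prev, row_next in zip(prev, nxt)]
        )
    return diffs


if __name__ == "__main__":
    clip = [
        [[0, 0], [0, 0]],
        [[1, 0], [0, 2]],
        [[1, 1], [0, 2]],
    ]
    for diff in adjacent_frame_differences(clip):
        print(diff)
```

In a learned network the differences would typically be taken over feature maps rather than raw pixels, so that the temporal cue can be trained jointly with the spatial features.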
Deep learning technology has rapidly evolved in recent years. Bone age assessment (BAA) is a typical object detection and classification problem that would benefit from deep learning. Convolutional neural networks (CNNs) and their variants are hence increasingly used for automating BAA, and they have shown promising results. In this paper, we propose a complete end-to-end BAA system to automate the entire process of the Tanner-Whitehouse 3 (TW3) method, starting from localization of the epiphysis-metaphysis growth regions within 13 different bones and ending with estimation of the corresponding bone age (BA). Specific modifications to the CNNs and other stages are proposed to improve results. In addition, an annotated database of 3300 X-ray images is built to train and evaluate the system. The experimental results show that the average top-1 and top-2 prediction accuracies for skeletal bone maturity levels for 13 regions of interest are 79.6% and 97.2%, respectively. The mean absolute error and root mean squared error in age prediction are 0.46 years and 0.62 years, respectively, and 97.6% of predictions fall within one year of the ground truth. The proposed system is shown to outperform a commercially available Greulich-Pyle (GP)-based system, demonstrating the potential for practical clinical use.
Index Terms: Bone age assessment, deep learning, GP, TW3.
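The evaluation measures named above (top-k accuracy, mean absolute error, root mean squared error, and accuracy within one year) can be sketched as follows. This is a minimal illustration using the standard definitions, not the authors' evaluation code; the function names and the toy inputs are assumptions.

```python
import math
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between true and predicted ages."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Square root of the average squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def accuracy_within(y_true: Sequence[float], y_pred: Sequence[float],
                    tolerance: float = 1.0) -> float:
    """Fraction of predictions within `tolerance` years of the ground truth."""
    hits = sum(abs(t - p) <= tolerance for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def top_k_accuracy(labels: Sequence[int], scores: Sequence[Sequence[float]],
                   k: int) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for label, row in zip(labels, scores):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the paper.
    ages_true = [10.0, 12.0, 8.0, 15.0]
    ages_pred = [10.5, 11.0, 8.0, 17.0]
    print(mean_absolute_error(ages_true, ages_pred))
    print(root_mean_squared_error(ages_true, ages_pred))
    print(accuracy_within(ages_true, ages_pred, tolerance=1.0))
```

Top-2 accuracy is informative here because adjacent maturity levels are visually similar, so a prediction one level off is a much milder error than an arbitrary misclassification.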
In this paper, we propose an effective online method to recognize handwritten music symbols. Based on the fact that most music symbols can be regarded as combinations of several basic strokes, the proposed method first classifies all the strokes comprising an input symbol and then recognizes the symbol based on the results of stroke classification. For stroke classification, we propose to use three types of features: the size information, the histogram of directional movement angles, and the histogram of undirected movement angles. When combining classified strokes into a music symbol, we utilize their sizes and spatial relation together with their combination. The proposed method is evaluated using two datasets including HOMUS, one of the largest music symbol datasets. It achieves a significant improvement of about 10% in recognition rates over the state-of-the-art method on these datasets. This shows the superiority of the proposed method in online handwritten music symbol recognition.
Nojun Kwak
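The three stroke features listed above can be sketched for a single pen stroke as follows. This is one plausible reading under stated assumptions, not the authors' implementation: the bin count, the normalization, the use of the bounding box as "size information", and the names `angle_histogram` and `stroke_size` are all hypothetical choices.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def angle_histogram(stroke: Sequence[Point], bins: int = 8,
                    undirected: bool = False) -> List[float]:
    """Normalized histogram of movement angles between consecutive points.

    Directed angles are folded into [0, 2*pi); undirected angles are
    folded into [0, pi), so opposite directions share a bin.
    """
    period = math.pi if undirected else 2 * math.pi
    hist = [0.0] * bins
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # repeated sample, no movement direction
        angle = math.atan2(dy, dx) % period
        # Clamp guards against rounding that lands exactly on `period`.
        idx = min(int(angle / period * bins), bins - 1)
        hist[idx] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def stroke_size(stroke: Sequence[Point]) -> Tuple[float, float]:
    """Width and height of the stroke's bounding box (assumed size feature)."""
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    return (max(xs) - min(xs), max(ys) - min(ys))


if __name__ == "__main__":
    # A stroke that moves right and then back left along the same line.
    stroke = [(0, 0), (1, 0), (0, 0)]
    print(angle_histogram(stroke, bins=4))
    print(angle_histogram(stroke, bins=4, undirected=True))
    print(stroke_size(stroke))
```

The two histograms are complementary: the directed one separates a stroke drawn left-to-right from the same shape drawn right-to-left, while the undirected one treats them as identical, which makes it robust to writers who draw the same stroke in opposite directions.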