Magnetic resonance imaging (MRI) can indirectly reflect microscopic changes in spinal cord lesions; however, the application of deep learning to MRI for classifying and detecting lesions of cervical spinal cord diseases has not been sufficiently explored. In this study, we implemented a deep neural network to detect lesions caused by cervical diseases in MRI. We retrospectively reviewed the MRI scans of 1,500 patients, irrespective of whether they had cervical diseases, who were treated in our hospital from January 2013 to December 2018. We randomly divided the MRI data into three groups: a disc group (800 scans), an injured group (200 scans), and a normal group (500 scans). After designing the relevant parameters, we used a Faster Region-based Convolutional Neural Network (Faster R-CNN) with either ResNet-50 or VGG-16 as the backbone convolutional feature extractor to detect lesions in the MRI scans. Experimental results showed that the prediction accuracy and speed of Faster R-CNN with ResNet-50 and with VGG-16 in detecting and recognizing lesions in cervical spinal cord MRI were satisfactory: the mean average precisions (mAPs) were 88.6% and 72.3%, respectively, and the testing times were 0.22 and 0.24 s/image, respectively. Faster R-CNN can thus identify and detect lesions in cervical MRI and may, to some extent, aid radiologists and spine surgeons in their diagnoses. Our results can motivate future research combining medical imaging and deep learning.
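To make the detection setup concrete, below is a minimal sketch of a Faster R-CNN with a ResNet-50 backbone built on torchvision, not the authors' own implementation. The class count mirroring the abstract's three groups (disc, injured, normal), the pretrained weights, the input size, and the head replacement are all illustrative assumptions.

```python
# Minimal sketch (not the paper's exact pipeline): Faster R-CNN with a
# ResNet-50 backbone via torchvision, applied to a single MRI slice.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 3 + 1  # assumed: disc / injured / normal classes + background

def build_detector(num_classes: int = NUM_CLASSES):
    # Start from a COCO-pretrained Faster R-CNN with a ResNet-50 FPN backbone.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    # Swap in a box-classification head sized for the lesion classes.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

model = build_detector()
model.eval()
# A grayscale MRI slice replicated to 3 channels, values in [0, 1].
image = torch.rand(3, 512, 512)
with torch.no_grad():
    predictions = model([image])  # list of dicts: boxes, labels, scores
print(predictions[0]["boxes"].shape, predictions[0]["scores"][:5])
```

In this two-stage design, the backbone extracts shared feature maps, a region proposal network suggests candidate boxes, and the replaced head classifies each proposal, which is why only the final predictor needs to change for a new lesion taxonomy.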
The two-stream network architecture can capture temporal and spatial features from videos simultaneously and has achieved excellent performance on video action recognition tasks. However, videos contain a fair amount of redundant information along both the temporal and spatial dimensions, which increases the complexity of network learning. To address this problem, we propose the residual spatial-temporal attention network (R-STAN), a feed-forward convolutional neural network for video action recognition that uses residual learning and a spatial-temporal attention mechanism to make the network focus on discriminative temporal and spatial features. In R-STAN, each stream is constructed by stacking residual spatial-temporal attention blocks (R-STABs); the spatial-temporal attention modules integrated into the residual blocks generate attention-aware features along the temporal and spatial dimensions, which largely reduces redundant information. Together with the properties of residual learning, this allows us to construct a very deep network for learning spatial-temporal information in videos. As the layers go deeper, the attention-aware features produced by different R-STABs adapt accordingly. We validate R-STAN through extensive experiments on the UCF101 and HMDB51 datasets, which show that combining residual learning with the spatial-temporal attention mechanism contributes substantially to video action recognition performance.
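As an illustration of how a residual block can be combined with spatial-temporal attention, the sketch below implements one plausible R-STAB in PyTorch. The abstract does not specify the exact attention design, so the temporal and spatial attention branches here (pooled features followed by 1x1x1 convolutions and sigmoid gates) are assumptions, not the paper's architecture.

```python
# Illustrative sketch only: one possible residual spatial-temporal attention
# block (R-STAB). Tensors have shape (batch, channels, time, height, width).
import torch
import torch.nn as nn

class RSTAB(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm3d(channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm3d(channels),
        )
        # Assumed design: one attention weight per frame / per spatial location.
        self.temporal_att = nn.Conv3d(channels, 1, kernel_size=1)
        self.spatial_att = nn.Conv3d(channels, 1, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        f = self.body(x)
        # Temporal weights: pool over space, gate over (B, 1, T, 1, 1).
        t_w = torch.sigmoid(self.temporal_att(f.mean(dim=(3, 4), keepdim=True)))
        # Spatial weights: pool over time, gate over (B, 1, 1, H, W).
        s_w = torch.sigmoid(self.spatial_att(f.mean(dim=2, keepdim=True)))
        # Attention-aware features plus the identity shortcut (residual learning).
        return self.relu(x + f * t_w * s_w)

block = RSTAB(64)
clip = torch.rand(2, 64, 8, 56, 56)  # batch of 8-frame feature clips
print(block(clip).shape)  # torch.Size([2, 64, 8, 56, 56])
```

Because the attention gates modulate only the residual branch while the identity path passes through unchanged, such blocks can be stacked deeply without blocking gradient flow, which matches the abstract's point about building a very deep network.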