Current correlation-based target tracking methods mainly use deep learning to extract spatial information from video frames and then perform correlation on that basis. However, they do not extract the motion features of the tracked target along the time axis, so the target is easily lost when occlusion occurs. To address this, a spatiotemporal motion target tracking model incorporating Kalman filtering is proposed to alleviate the occlusion problem during tracking. In combination with a segmentation model, a suitable model is selected by score to predict or detect the current state of the target, and an ellipse-fitting strategy is used to evaluate the bounding boxes online. Experiments on the VOT2016 and VOT2018 datasets demonstrate that the approach performs well and remains stable under multiple challenges, such as occlusion, while maintaining real-time performance.
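The predict/detect switching the abstract describes can be illustrated with a standard constant-velocity Kalman filter over the target centre. The state layout, noise covariances, and the idea of falling back to `predict` under occlusion are assumptions for illustration, not the paper's exact model:

```python
import numpy as np

class KalmanTracker:
    """Minimal constant-velocity Kalman filter for a 2-D target centre.

    State is [x, y, vx, vy]; only the position is observed.
    """

    def __init__(self, x0, y0, dt=1.0):
        self.x = np.array([x0, y0, 0.0, 0.0], dtype=float)  # state estimate
        self.P = np.eye(4) * 10.0                           # state covariance
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)      # constant-velocity motion
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)      # observe position only
        self.Q = np.eye(4) * 0.01                           # process noise (assumed)
        self.R = np.eye(2) * 1.0                            # measurement noise (assumed)

    def predict(self):
        # Used when the detector score is low (e.g. the target is occluded):
        # propagate the state with the motion model alone.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        # Used when the detector is confident: correct with the measurement.
        y = np.asarray(z, dtype=float) - self.H @ self.x     # innovation
        S = self.H @ self.P @ self.H.T + self.R              # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)             # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```

After a few confident updates, the filter's `predict` output can stand in for the detection while the target is occluded.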
Because current graph convolution operations on skeleton graphs are limited to locally adjacent nodes, or omit the overall relative position information of the skeleton, a method is proposed to enhance joint-position information through relative position encoding of the skeleton. The method takes the central joint of the human trunk as the root node; all joints form a tree structure according to the natural connections of the body, and each joint's code inherits the code of its parent node and additionally includes the joint's own index among its siblings. Furthermore, since graph convolutional network models generally have many channels and the channel information itself is strongly correlated, a channel frequency-division-and-recombination operation is proposed to reflect the differences of information in different frequency bands within the channels. Experiments show that the proposed method improves the performance of the model it is embedded in to a certain extent.
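The tree-based encoding rule (inherit the parent's code, then append the node's sibling index) can be sketched as follows. The seven-joint mini-skeleton is a made-up example, not any dataset's actual joint layout:

```python
def encode_skeleton(children, root):
    """Assign each joint a tuple code: parent's code + its index among siblings."""
    codes = {root: (0,)}   # the root (trunk centre) gets code (0,)
    stack = [root]
    while stack:
        node = stack.pop()
        for i, child in enumerate(children.get(node, [])):
            codes[child] = codes[node] + (i,)  # inherit parent code, append sibling index
            stack.append(child)
    return codes

# Hypothetical mini-skeleton: trunk centre -> {neck, left hip, right hip},
# neck -> head, each hip -> one knee.
skeleton = {
    "trunk": ["neck", "l_hip", "r_hip"],
    "neck": ["head"],
    "l_hip": ["l_knee"],
    "r_hip": ["r_knee"],
}
codes = encode_skeleton(skeleton, "trunk")
```

Under this rule a code like `(0, 2, 0)` for the right knee records the whole path from the trunk, so two joints' codes share a prefix exactly as far as their common ancestor.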
Medical image segmentation plays a vital role in computer-aided diagnosis and intelligent medical treatment, preprocessing medical images to help doctors diagnose diseases more effectively. The class activation map (CAM) is an important technique in weakly supervised segmentation, achieving image segmentation without pixel-level label training, which suits the needs of medical image segmentation well. However, the CAMs obtained remain imperfect because of global average pooling (GAP): GAP gives important and unimportant regions equal attention during training, so the CAM cannot delineate the boundaries of the target regions well. To solve this problem, a global weighted average pooling network that fuses the grayscale information of medical images is proposed. The network addresses GAP's equal treatment of important and unimportant regions of the feature map by learning different weights for different positions of the feature map before the GAP operation. At the same time, exploiting the grayscale difference between tumor and non-tumor areas in brain tumor images, the low-level grayscale information of the medical image is fused with the high-level semantic information extracted by the network to learn these weights, making full use of feature maps at different levels. Experimental results on the popular medical image dataset BraTS2019 show that the proposed method improves the performance of CAM and helps it fit object boundaries. In the DSC evaluation, the proposed method achieves a score of 64.1%, a 4.6% improvement over a recent method.
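The contrast between plain GAP and a position-weighted pooling can be sketched as below. The weight map here is a stand-in for what the paper learns from the fused grayscale and semantic features, and the softmax normalisation over spatial positions is an assumption:

```python
import numpy as np

def global_average_pool(feat):
    """Plain GAP: feat is (C, H, W); every spatial position is weighted equally."""
    return feat.mean(axis=(1, 2))

def global_weighted_average_pool(feat, weight_map):
    """Weighted pooling: weight_map is (H, W) unnormalised scores.

    A softmax over spatial positions lets important regions dominate the
    pooled descriptor instead of being averaged away.
    """
    w = np.exp(weight_map - weight_map.max())
    w = w / w.sum()
    return (feat * w[None, :, :]).sum(axis=(1, 2))
```

With a uniform weight map the two poolings coincide; a peaked weight map lets the pooled vector be dominated by the highlighted region, which is the behaviour the proposed network learns.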
Background Retinal vessel segmentation provides an important basis for determining the geometric characteristics of retinal vessels and for diagnosing related diseases. Retinal vessels consist mainly of coarse and fine vessels, which are unevenly distributed. Current deep-learning-based retinal vessel segmentation networks extract coarse vessels easily but tend to miss the fine vessels, which are harder to extract. Methods A scale-aware dense residual model, a multi-output weighted loss, and an attention mechanism are proposed and incorporated into a U-shaped network. The model extracts image features through residual modules and applies a multi-scale feature aggregation method after the last encoder layer to extract the deep information of the network. Each decoder layer's output is upsampled and compared with the ground truth separately to obtain multiple output losses, and the output of the last decoder layer serves as the final prediction. Result The proposed network is tested on DRIVE and STARE using four evaluation metrics: Dice, accuracy, mIoU, and recall. On the DRIVE dataset the four metrics are 80.40%, 96.67%, 82.14%, and 88.10%, respectively; on the STARE dataset they are 83.41%, 97.39%, 84.38%, and 88.84%. Conclusion The experimental results show that the proposed network performs better, extracts more continuous fine vessels, and reduces missed and false segmentation to a certain extent.
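The multi-output weighted loss described in the Methods can be sketched as a weighted sum of per-decoder-layer losses against the ground truth. The binary cross-entropy term and the example weights are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Pixel-wise binary cross-entropy between a probability map and a binary mask."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def multi_output_loss(preds, target, weights):
    """Combine per-layer losses with weights.

    preds: list of (H, W) probability maps, one per decoder layer, each
    already upsampled to the ground-truth resolution.
    """
    assert len(preds) == len(weights)
    return sum(w * bce(p, target) for p, w in zip(preds, weights))
```

Supervising every decoder layer this way pushes even the shallow, high-resolution layers toward the vessel mask, which is one plausible reason fine vessels are preserved better than with a single final-layer loss.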