Deep learning frameworks utilizing convolutional neural networks (CNNs) have frequently been used for brain age prediction and have achieved outstanding performance. Nevertheless, deep learning remains a black box, as it is hard to interpret which brain regions contribute significantly to the predictions. To tackle this challenge, we first trained a lightweight, fully convolutional CNN model for brain age estimation on a large dataset (N = 3054, age range = [8, 80] years) and tested it on an independent dataset (N = 555, mean absolute error (MAE) = 4.45 years, r = 0.96). We then developed an interpretable scheme combining network occlusion sensitivity analysis (NOSA) with a fine-grained human brain atlas to uncover the regional contributions learned by the model. Our findings show that the dorsolateral and dorsomedial frontal cortex, anterior cingulate cortex, and thalamus contributed most to age prediction across the lifespan. More interestingly, we observed that different regions showed divergent contribution patterns in specific age groups and that the two hemispheres contributed differently to the predictions. Regions in the frontal lobe were essential predictors in both the developmental and aging stages, while the thalamus remained relatively stable and saliently correlated with other regional changes throughout the lifespan. The lateral and medial temporal regions gradually became involved during the aging phase. At the network level, the frontoparietal and default mode networks showed an inverted U-shaped contribution from the developmental to the aging stages. The framework can identify regional contributions to the brain age prediction model, which could help increase the model's interpretability when it serves as an aging biomarker.
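As a hedged illustration of the occlusion-based interpretability idea, the sketch below masks one atlas-defined region at a time and records how much the predicted age shifts; `model`, `mri`, and `atlas` are hypothetical placeholders (a trained PyTorch age regressor, a preprocessed scan, and an integer label map), not the authors' released code.

```python
# Minimal NOSA-style sketch: occlude each atlas region and measure the
# change in predicted brain age. Larger change = more influential region.
import torch

@torch.no_grad()
def region_sensitivity(model, mri, atlas):
    """mri: (1, 1, D, H, W) tensor; atlas: (D, H, W) integer label map."""
    model.eval()
    baseline = model(mri).item()               # predicted age, unoccluded
    scores = {}
    for label in atlas.unique().tolist():
        if label == 0:                         # assume 0 = background
            continue
        occluded = mri.clone()
        occluded[0, 0][atlas == label] = 0.0   # zero out one region
        scores[label] = abs(model(occluded).item() - baseline)
    return scores
```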
Animal pose estimation has important value in both theoretical research and practical applications such as zoology and wildlife conservation. This study presents DepthFormer, a simple but effective high-resolution Transformer model for animal pose estimation, designed to address the difficulty of running large-scale multi-animal pose estimation models under limited computing resources. We adopt a multi-branch parallel design that maintains high-resolution representations throughout the network. Exploiting two similarities between self-attention and depthwise convolution, namely sparse connectivity and weight sharing, we combine a carefully designed Transformer structure with representative batch normalization in a new basic block that reduces both the number of parameters and the amount of computation required. In addition, four PoolFormer blocks are introduced after the parallel network to maintain good performance. Benchmark evaluation is performed on the public AP-10K dataset, which contains 23 animal families and 54 species, and the results are compared with six other state-of-the-art pose estimation networks. The results demonstrate that DepthFormer surpasses other popular lightweight networks (e.g., Lite-HRNet and HRFormer-Tiny) on this task. This work provides effective technical support for accurately estimating animal poses under limited computing resources.
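For concreteness, a minimal PoolFormer-style block is sketched below in PyTorch, in the spirit of the four blocks appended after the parallel network: average pooling replaces self-attention as the token mixer, followed by a channel MLP. The dimensions, pool size, and normalization choice are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class PoolFormerBlock(nn.Module):
    def __init__(self, dim, pool_size=3, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.GroupNorm(1, dim)
        self.pool = nn.AvgPool2d(pool_size, stride=1,
                                 padding=pool_size // 2,
                                 count_include_pad=False)
        self.norm2 = nn.GroupNorm(1, dim)
        self.mlp = nn.Sequential(
            nn.Conv2d(dim, dim * mlp_ratio, 1),
            nn.GELU(),
            nn.Conv2d(dim * mlp_ratio, dim, 1),
        )

    def forward(self, x):                  # x: (B, C, H, W)
        y = self.norm1(x)
        x = x + self.pool(y) - y           # pooling-minus-identity token mixer
        x = x + self.mlp(self.norm2(x))    # channel MLP with residual
        return x

# usage: out = PoolFormerBlock(64)(torch.randn(1, 64, 32, 32))
```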
The macaque monkey is an important surrogate for humans in psychological and neuroscience research, and such studies depend on accurately estimating the pose of macaques. Many large-scale models have achieved state-of-the-art results in macaque pose estimation, but they are difficult to deploy when computing resources are limited. Combining the structure of high-resolution networks with light-weight design principles, we propose an attention-refined light-weight high-resolution network for macaque monkey pose estimation (HR-MPE). A multi-branch parallel structure is adopted to maintain high-resolution representations throughout the network. Moreover, a novel basic block is designed around a powerful Transformer structure and polarized self-attention, giving it a simple structure with few parameters. Two attention-refined blocks are added at the end of the parallel structure; they are composed of light-weight asymmetric convolutions and a nearly parameter-free triplet attention, yielding richer representational information. An unbiased data processing method is also utilized to obtain accurate flipping results. Experiments are conducted on a macaque dataset containing more than 13,000 images. Our network reaches a 77.0 AP score, surpassing HRFormer by 1.8 AP while using fewer parameters.
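The flipping step can be illustrated with the common flip-test averaging sketched below; the unbiased variant additionally corrects the sub-pixel misalignment that flipping introduces on even-sized heatmap grids, which is omitted here for brevity. `model` and `flip_pairs` are assumed placeholders, not the paper's code.

```python
# Flip-test averaging: predict heatmaps for the image and its mirror,
# flip the mirrored heatmaps back, swap left/right keypoint channels,
# and average the two predictions.
import torch

@torch.no_grad()
def flip_test(model, image, flip_pairs):
    """image: (1, 3, H, W); flip_pairs: [(left_idx, right_idx), ...]."""
    heat = model(image)                        # (1, K, h, w)
    heat_f = model(torch.flip(image, dims=[3]))
    heat_f = torch.flip(heat_f, dims=[3])      # undo the spatial mirror
    for l, r in flip_pairs:                    # undo the left/right swap
        heat_f[:, [l, r]] = heat_f[:, [r, l]]
    return (heat + heat_f) / 2
```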