Xiang Li scite author profile

In standard Convolutional Neural Networks (CNNs), the receptive fields of artificial neurons in each layer are designed to share the same size. It is well-known in the neuroscience community that the receptive field size of visual cortical neurons are modulated by the stimulus, which has been rarely considered in constructing CNNs. We propose a dynamic selection mechanism in CNNs that allows each neuron to adaptively adjust its receptive field size based on multiple scales of input information. A building block called Selective Kernel (SK) unit is designed, in which multiple branches with different kernel sizes are fused using softmax attention that is guided by the information in these branches. Different attentions on these branches yield different sizes of the effective receptive fields of neurons in the fusion layer. Multiple SK units are stacked to a deep network termed Selective Kernel Networks (SKNets). On the ImageNet and CIFAR benchmarks, we empirically show that SKNet outperforms the existing state-of-the-art architectures with lower model complexity. Detailed analyses show that the neurons in SKNet can capture target objects with different scales, which verifies the capability of neurons for adaptively adjusting their receptive field sizes according to the input. The code and models are available at https://github.com/implus

show abstract

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions

Wang

Xie

et al. 2021

Preprint

215

472

View full text Add to dashboard Cite

Deep Learning-Based Image Segmentation on Multimodal Medical Imaging

Guo

Huang

et al. 2019

IEEE Trans. Radiat. Plasma Med. Sci.

289

144

View full text Add to dashboard Cite

Multi-modality medical imaging techniques have been increasingly applied in clinical practice and research studies. Corresponding multi-modal image analysis and ensemble learning schemes have seen rapid growth and bring unique value to medical applications. Motivated by the recent success of applying deep learning methods to medical image processing, we first propose an algorithmic architecture for supervised multi-modal image analysis with cross-modality fusion at the feature learning level, classifier level, and decision-making level. We then design and implement an image segmentation system based on deep Convolutional Neural Networks (CNN) to contour the lesions of soft tissue sarcomas using multi-modal images, including those from Magnetic Resonance Imaging (MRI), Computed Tomography (CT) and Positron Emission Tomography (PET). The network trained with multi-modal images shows superior performance compared to networks trained with single-modal images. For the task of tumor segmentation, performing image fusion within the network (i.e. fusing at convolutional or fully connected layers) is generally better than fusing images at the network output (i.e. voting). This study provides empirical guidance for the design and application of multi-modal image analysis.

show abstract

Disentangling User Interest and Conformity for Recommendation with Causal Embedding

et al. 2021

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Xiang Li

Selective Kernel Networks

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions

Deep Learning-Based Image Segmentation on Multimodal Medical Imaging

Disentangling User Interest and Conformity for Recommendation with Causal Embedding

Contact Info

Product

Resources

About