2020
DOI: 10.1109/access.2020.2990700

Multi-Lane Capsule Network for Classifying Images With Complex Background

Abstract: Capsule Network (CapsNet) is a novel structure for deep neural networks that maps the region of a target instance to vectors and matrices rather than scalars. This process is enabled by the dynamic routing algorithm, which helps CapsNet achieve more robust capacity with fewer parameters than traditional CNNs. However, one drawback of capsules is that they tend to account for everything in the image, which leads to poor performance when the backgrounds are too varied to model with a reasonably sized net. We proposed a…

Cited by 50 publications (22 citation statements)
References 23 publications
“…We added a vertical anchor regression mechanism to CTPN to detect text lines, and at the same time use PVANet's image feature extraction module to optimize and accelerate the detection network [12][13]. Finally, this paper also proposes a literacy model based on spatial pyramid pooling and fusion of Chinese character skeleton features, which reduces the loss of image information caused by fixed input image size through spatial pyramid pooling; improves the accuracy of Chinese character recognition by adding Chinese character skeleton features [14][15][16].…”
Section: Methods (mentioning)
confidence: 99%
“…When related entities are mentioned in the article, the semantic vector is used as external knowledge for direct accumulation, leading to deficiencies in both knowledge extraction and knowledge fusion. To take full advantage of facts in KGs which contains rich language patterns, we propose a novel module CapsBERT as a preprocessing step, which fine-tunes pre-trained BERT [36] and utilizes a capsule neural network (CapsNet) [37] for knowledge graph representation. BERT is a state-of-the-art contextual language representation model built on a multilayer bidirectional Transformer encoder, which is able to capture complex linguistic phenomena.…”
Section: Methods (mentioning)
confidence: 99%
“…Equation
Squash [14]: $v_j = \frac{\|s_j\|^2}{1 + \|s_j\|^2} \, \frac{s_j}{\|s_j\|}$
HSquash [22]: $v_j = \frac{\|s_j/4\|^2}{1 + \|s_j/4\|^2} \, \frac{s_j}{\|s_j\|}$
Strict-squash [23]: $v_j = 0.69\|s_j\|^2 * 2 - 0.6\|s_j\| - 1.115$
Squash-4 [24]: $v_j = \frac{\|s_j\|^2}{0.5 + \|s_j\|^2} \, \frac{s_j}{\|s_j\|}$
e-Squash…”
Section: Squash Function (mentioning)
confidence: 99%