2019
DOI: 10.48550/arxiv.1908.05900
Preprint

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Cited by 6 publications (4 citation statements)
References 46 publications
“…The CBS structure consists of the convolutional layers, batch normalization (BN), and SiLU [33] activation function, the SPPF structure transmits the input information through several MaxPool layers which sizes are 5×5 sequentially. A feature pyramid structure based on PANet [34] is used in the neck subnetwork, which locates between the detection subnetwork and backbone subnetwork. In this layer, strong semantic characteristics are conveyed from the top down while strong location features are transmitted from the bottom up.…”
Section: Methods (mentioning)
confidence: 99%
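
The sequential 5×5 MaxPool arrangement mentioned in the statement above can be illustrated with a small sketch. It is not the cited authors' code: it assumes three stride-1 pools with "same" padding whose outputs are kept alongside the input, as in common YOLOv5 implementations, and it leaves out the convolutions around the pooling. The names `max_pool_same` and `sppf_branches` are hypothetical.

```python
NEG_INF = float("-inf")


def max_pool_same(grid, k):
    """Stride-1 max pooling over a 2D list, ignoring out-of-range cells
    (equivalent to padding the border with -inf)."""
    r = k // 2
    h, w = len(grid), len(grid[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            best = NEG_INF
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        best = max(best, grid[y][x])
            row.append(best)
        out.append(row)
    return out


def sppf_branches(grid, k=5):
    """Apply the same k x k pool three times in sequence (assumed count)
    and return the input plus each intermediate result."""
    p1 = max_pool_same(grid, k)
    p2 = max_pool_same(p1, k)
    p3 = max_pool_same(p2, k)
    return [grid, p1, p2, p3]
```

Under these assumptions, chaining the 5×5 pool twice covers the same window as a single 9×9 pool, and chaining it three times matches a 13×13 pool, so sequential pooling yields several receptive-field sizes while reusing earlier results.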
“…(2) YOLOv5s neck network The neck network of YOLOv5s follows the structure of a feature pyramid network (FPN) [54] and path aggregation network (PAN) [55], as shown in figure 4, and is mainly composed of convolutional, upsampling, and concat layers, as well as a C3 module. FPN constructs feature pyramids of all scales using a top-down process.…”
Section: Datasets (mentioning)
confidence: 99%
“…Neck: The Neck component is in charge of fusing the backbone features and improving their representational power. For its Neck section, YOLOv5 employs the Feature Pyramid Network (FPN [28]) and Pixel Aggregation Network (PAN [29]) structures. The FPN structure transmits semantic information from top to bottom via an upsampling operation, while the PAN structure transmits location information from bottom-to-top via a downsampling operation.…”
Section: Yolov5 (mentioning)
confidence: 99%
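
The top-down and bottom-up flow these statements attribute to the neck can be sketched as follows. This is a toy illustration under stated assumptions, not code from any of the citing papers: feature maps are lists of 1D channels, fusion is channel-wise concatenation, stride-2 subsampling stands in for a stride-2 convolution, and the convolution and C3 blocks between the fusion steps are omitted. All function names are hypothetical.

```python
def upsample2(fmap):
    """Nearest-neighbour upsampling by 2 along the spatial axis."""
    return [[v for v in ch for _ in range(2)] for ch in fmap]


def downsample2(fmap):
    """Keep every second position (stand-in for a stride-2 convolution)."""
    return [ch[::2] for ch in fmap]


def concat(a, b):
    """Channel-wise concatenation of two maps with equal spatial length."""
    assert len(a[0]) == len(b[0]), "spatial sizes must match"
    return a + b


def neck(c3, c4, c5):
    """Fuse three backbone levels, ordered fine (c3) to coarse (c5)."""
    # Top-down path: coarse, semantically strong features are upsampled
    # and merged into the finer levels.
    p5 = c5
    p4 = concat(upsample2(p5), c4)
    p3 = concat(upsample2(p4), c3)
    # Bottom-up path: fine, well-localised features are downsampled
    # and merged back into the coarser levels.
    n3 = p3
    n4 = concat(downsample2(n3), p4)
    n5 = concat(downsample2(n4), p5)
    return n3, n4, n5
```

In this sketch every output level ends up holding channels that originate from all three inputs, which is the purpose of running the two passes in sequence. Real networks insert learned layers after each fusion to mix channels and keep their number bounded.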