2021
DOI: 10.1007/s10489-021-02446-8

Joint pyramid attention network for real-time semantic segmentation of urban scenes

Cited by 43 publications (22 citation statements)
References 38 publications

“…Compared with Fig. 11 (b) and (c), the loss of the acoustic features with postprocessing is much lower than that without postprocessing, which proves the effectiveness of the postprocessing module [46]. Figure 11 (d) shows that the model has learned after how many time steps it should stop generating the predicted acoustic features.…”
Section: Methods (mentioning)
Confidence: 78%

“…After about 200k iterations, the loss curve converges to a very low value, which indicates satisfactory performance of the model. Compared with Fig. 11 (b) and (c), the loss of the acoustic features with postprocessing is much lower than that without postprocessing, which proves the effectiveness of the postprocessing module [46]. Figure 11 (d) shows that the model has learned after how many time steps it should stop generating the predicted acoustic features. To evaluate the sound quality of the synthesized speech, objective and subjective tests are conducted.…”
Citation type: mentioning
Confidence: 76%

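The statement above describes a postprocessing module that refines the predicted acoustic features and a learned criterion for when generation should stop. The excerpt does not give the architecture, so the following is a minimal PyTorch sketch, assuming a Tacotron-style convolutional post-net plus a per-step stop-token head; the names PostNet, StopTokenHead, and n_mels are illustrative and not taken from the cited paper.

```python
import torch
import torch.nn as nn

class PostNet(nn.Module):
    """Hypothetical postprocessing module: a small 1-D conv stack that
    predicts a residual correction to the coarse acoustic features."""
    def __init__(self, n_mels=80, channels=256, n_layers=5):
        super().__init__()
        layers, in_ch = [], n_mels
        for _ in range(n_layers - 1):
            layers += [nn.Conv1d(in_ch, channels, kernel_size=5, padding=2),
                       nn.BatchNorm1d(channels), nn.Tanh()]
            in_ch = channels
        layers.append(nn.Conv1d(in_ch, n_mels, kernel_size=5, padding=2))
        self.net = nn.Sequential(*layers)

    def forward(self, mel):             # mel: (batch, n_mels, time)
        return mel + self.net(mel)      # refined features = coarse + residual

class StopTokenHead(nn.Module):
    """Hypothetical stop-token head: predicts, per time step, the
    probability that generation of acoustic features should stop."""
    def __init__(self, n_mels=80):
        super().__init__()
        self.proj = nn.Linear(n_mels, 1)

    def forward(self, mel):             # mel: (batch, n_mels, time)
        return torch.sigmoid(self.proj(mel.transpose(1, 2))).squeeze(-1)

# Usage sketch
coarse = torch.randn(2, 80, 120)        # coarse mel-spectrogram predictions
refined = PostNet()(coarse)             # postprocessed acoustic features
stop_prob = StopTokenHead()(refined)    # (2, 120) per-step stop probabilities
```

Training a head like this against a binary per-step target is one common way a model can learn after how many time steps to stop generating, as the excerpt describes.
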
“…Presently, the residual structure from ResNet [27], as illustrated in Figure 2 (a), is widely used to reduce parameters and accelerate inference. ERFNet [16] factorized convolutions effectively and made the network lightweight, as depicted in Figure 2 (b).…”
Section: Multi-scale Information Interaction Module (mentioning)
Confidence: 99%

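The excerpt contrasts ResNet's residual structure with ERFNet's factorized convolutions. As a rough illustration only, the sketch below follows the commonly used ERFNet-style "non-bottleneck-1D" idea, splitting each 3x3 convolution into a 3x1 and a 1x3 convolution inside a residual branch; it is not taken from the cited papers' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedResidualBlock(nn.Module):
    """ERFNet-style block: each 3x3 conv is factorized into a 3x1 followed
    by a 1x3 conv, cutting parameters while keeping the receptive field;
    the block input is added back through a ResNet-style skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.conv3x1_1 = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.conv1x3_1 = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv3x1_2 = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.conv1x3_2 = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = self.bn1(F.relu(self.conv1x3_1(F.relu(self.conv3x1_1(x)))))
        out = self.bn2(F.relu(self.conv1x3_2(F.relu(self.conv3x1_2(out)))))
        return F.relu(out + x)          # residual (skip) connection

# Usage sketch
x = torch.randn(1, 64, 128, 256)        # (batch, channels, H, W)
y = FactorizedResidualBlock(64)(x)      # same shape as the input
```

Factorizing a k×k kernel into k×1 and 1×k reduces the per-layer parameter count from roughly k²·C² to 2k·C² while keeping the same receptive field, which is why such blocks are attractive for real-time segmentation.
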
“…There are many works on fusing multi-scale features. Hu et al [27] propose a joint feature pyramid (JFP) module, build a spatial detail extraction (SDE) module, and design a bilateral feature fusion (BFF) module, making full use of the correspondence between high-level and low-level features. Benčević et al [28] propose training a neural network on polar transformations of the original dataset, such that the polar origin for the transformation is the center point of the object.…”
Section: A. Related Work (mentioning)
Confidence: 99%

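The excerpt names the reviewed paper's JFP, SDE, and BFF modules but does not describe their internals. The sketch below shows one generic way to fuse a high-level semantic feature map with a low-level detail map via channel attention; the class BilateralFusion and its parameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BilateralFusion(nn.Module):
    """Generic high-/low-level feature fusion: the high-level map is
    upsampled and used to gate the low-level map via channel attention,
    then the two branches are summed. This only illustrates the idea of
    exploiting the correspondence between the two feature levels."""
    def __init__(self, low_ch, high_ch, out_ch):
        super().__init__()
        self.low_proj = nn.Conv2d(low_ch, out_ch, 1)
        self.high_proj = nn.Conv2d(high_ch, out_ch, 1)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_ch, out_ch, 1),
            nn.Sigmoid(),
        )

    def forward(self, low, high):
        high = self.high_proj(high)
        high = F.interpolate(high, size=low.shape[2:], mode="bilinear",
                             align_corners=False)
        low = self.low_proj(low)
        attn = self.gate(high)          # channel attention from the semantic branch
        return high + low * attn        # re-weighted detail branch, then fusion

# Usage sketch: 1/4-resolution detail features and 1/16-resolution semantics
low = torch.randn(1, 64, 128, 256)
high = torch.randn(1, 256, 32, 64)
fused = BilateralFusion(64, 256, 128)(low, high)   # -> (1, 128, 128, 256)
```

Gating the detail branch with channel statistics from the semantic branch is one common way to exploit the high-/low-level correspondence that the excerpt mentions.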