In autonomous driving and intelligent transportation systems, pedestrian detection is vital for reducing traffic accidents. However, detecting small-scale and occluded pedestrians is challenging because the limited feature content of small-scale objects is used ineffectively. The main reasons behind this are the stochastic nature of weight initialization and the greedy nature of non-maximum suppression. To overcome these issues, this work proposes a multifocus feature extractor module that fuses feature maps extracted with the Gaussian and Xavier mapping functions to enhance the effective receptive field. We also apply focused attention feature selection to a higher-layer feature map of the single shot detector (SSD) region proposal module and blend it with its low-layer feature to counter the loss of feature detail caused by convolution and pooling operations. In addition, this work proposes a decaying non-maximum suppression function that considers score and intersection over union (IoU) parameters to reduce the high miss rates caused by the greedy non-maximum suppression used by SSD. Extensive experiments were conducted on the Caltech pedestrian dataset with both the original and the improved annotations. Experimental results demonstrate the effectiveness of the proposed method, particularly for small and occluded pedestrians.
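The abstract does not give the form of the decaying suppression function, so the following is only a minimal sketch of the general idea, assuming a Gaussian decay in the style of Soft-NMS rather than the authors' actual formulation. Instead of discarding every box that overlaps the current best detection, the score of each overlapping box is reduced according to its IoU. The function names, the `sigma` parameter, and the `score_threshold` value are all illustrative assumptions.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def decaying_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Score-decaying NMS (assumed Gaussian decay, not the paper's exact function).

    Returns a list of (box_index, decayed_score) in selection order.
    """
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))

        # Decay the scores of the other boxes by their overlap with the
        # selected box, and drop only those whose score becomes negligible.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_threshold:
                decayed.append((idx, score))
        remaining = decayed
    return kept


# Hypothetical detections: two heavily overlapping boxes and one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
result = decaying_nms(boxes, scores)
```

In this example the second box overlaps the first heavily, so its score is lowered but the box is kept, whereas greedy NMS with a typical IoU threshold would remove it outright. That retained detection is what can help when one pedestrian partially occludes another.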
The estimation of crowd density is crucial for applications such as autonomous driving, visual surveillance, crowd control, public space planning, and warning visually distracted drivers prior to an accident. Models for estimating crowd density that have strong translational, reflective, and scale symmetry yield encouraging results. However, dynamic scenes with perspective distortions and rapidly changing spatial and temporal domains still present obstacles. The main reasons for this are the dynamic nature of a scene and the difficulty of representing and incorporating the feature space of objects of varying sizes into a prediction model. To overcome these issues, this paper proposes a framework of parallel multi-size receptive field units that leverages the features of most CNN layers, so that the features of objects of all sizes are represented and contribute to the model prediction. The proposed method utilizes features generated from lower to higher layers. As a result, different object scales can be handled at different framework depths, and various environmental densities can be estimated. However, including the features of most layers in the prediction model has a number of negative effects on the prediction outcome. To address them, asymmetric non-local attention and a channel weighting module for the feature map are proposed: the former handles noise and background details, and the latter re-weights each channel to make it more sensitive to important features while ignoring irrelevant ones. While the output predictions of some layers have high bias and low variance, those of other layers have low bias and high variance. Using stacked ensemble meta-learning, we combine the individual predictions made with lower-layer and higher-layer features to improve prediction while balancing the tradeoff between bias and variance. Extensive experiments were conducted on the UCF_CC_50 and ShanghaiTech datasets. The results indicate that the proposed method is effective for dense distributions and objects of various sizes.
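The abstract does not say which meta-learner is used to combine the per-layer predictions. As a hedged illustration of the stacking idea only, the sketch below assumes the simplest possible case: a linear meta-learner that fits two weights by least squares to combine crowd counts predicted from lower-layer and higher-layer features. The function names and the example counts are hypothetical and are not results from the paper.

```python
def fit_stacking_weights(pred_low, pred_high, targets):
    """Fit w_low, w_high minimizing sum((w_low*a + w_high*b - y)^2).

    Solves the 2x2 normal equations directly. In practice the base
    predictions should come from held-out data so that the meta-learner
    does not simply reward an overfitted base model.
    """
    s_aa = sum(a * a for a in pred_low)
    s_bb = sum(b * b for b in pred_high)
    s_ab = sum(a * b for a, b in zip(pred_low, pred_high))
    s_ay = sum(a * y for a, y in zip(pred_low, targets))
    s_by = sum(b * y for b, y in zip(pred_high, targets))

    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("base predictions are collinear; weights are not unique")

    # Cramer's rule for the 2x2 system.
    w_low = (s_ay * s_bb - s_by * s_ab) / det
    w_high = (s_by * s_aa - s_ay * s_ab) / det
    return w_low, w_high


def stacked_predict(pred_low, pred_high, weights):
    """Combine the two base predictions with the fitted weights."""
    w_low, w_high = weights
    return [w_low * a + w_high * b for a, b in zip(pred_low, pred_high)]


# Hypothetical held-out crowd counts from two base predictors.
low_layer_counts = [100.0, 200.0, 300.0, 420.0]
high_layer_counts = [120.0, 180.0, 330.0, 380.0]
true_counts = [110.0, 190.0, 315.0, 400.0]

weights = fit_stacking_weights(low_layer_counts, high_layer_counts, true_counts)
combined = stacked_predict(low_layer_counts, high_layer_counts, weights)
```

The example data are constructed so that the true count is the average of the two base predictions, which means the fitted weights come out close to 0.5 each. With real predictors, the weights would shift toward whichever branch is more reliable on the held-out data, which is how stacking trades off a high-bias branch against a high-variance one.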