Multi-scale object detection, especially small object detection, remains a challenging task. This paper proposes an improved multi-scale object detection network based on the single shot multibox detector (SSD), named SSD-MSN. SSD-MSN learns richer features of small objects from enlarged areas clipped from the raw image, and these extra features contribute to improved detection performance. SSD-MSN comprises two subnets: an area proposal network (APN) and a multi-scale object detection network, namely the SSD detector. The APN selects, from the clipped areas, the area proposals that contain one or more objects. The SSD detector predicts the class and location of objects from the raw image and the area proposals. In addition, this paper introduces an effective image-dividing strategy that generates 3×3 clipped areas from the raw image. The strategy not only generates more area proposals but also ensures that each clipped area can contain more objects. It acts as a form of data augmentation, which is critical to detection performance. Experimental results on PASCAL VOC and COCO show that SSD-MSN achieves state-of-the-art detection performance and effectively improves multi-scale object detection.
INDEX TERMS Multi-scale object detection, area proposal network, SSD, dividing image strategy.
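As a rough illustration of the image-dividing strategy mentioned in the abstract above, the following sketch computes a 3×3 grid of clipped areas. The abstract does not say how the areas are sized or placed, so the evenly spaced, overlapping windows and the `crop_frac` parameter are assumptions made only for this example, and `divide_image` is a hypothetical helper name, not the authors' implementation.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixel coordinates


def divide_image(width: int, height: int, grid: int = 3,
                 crop_frac: float = 0.5) -> List[Box]:
    """Return grid*grid crop boxes spread evenly over the image.

    Assumption: each crop covers `crop_frac` of the image width and height,
    so neighbouring crops overlap whenever crop_frac > 1 / grid.
    """
    if grid < 1 or not 0 < crop_frac <= 1:
        raise ValueError("grid must be >= 1 and crop_frac in (0, 1]")
    crop_w = max(1, round(width * crop_frac))
    crop_h = max(1, round(height * crop_frac))
    boxes: List[Box] = []
    for row in range(grid):
        for col in range(grid):
            # Slide the window so the first crop touches the top/left edge
            # and the last crop touches the bottom/right edge.
            x0 = 0 if grid == 1 else round(col * (width - crop_w) / (grid - 1))
            y0 = 0 if grid == 1 else round(row * (height - crop_h) / (grid - 1))
            boxes.append((x0, y0, x0 + crop_w, y0 + crop_h))
    return boxes


boxes = divide_image(300, 300)
```

In a pipeline like the one the abstract describes, each box would be cropped and enlarged before being passed to a proposal-selection step; that step is not sketched here.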
Malware detection enabled by deep learning has become an active research topic in network security. Existing deep-learning-based malware detection methods suffer from several issues, such as weak deep feature extraction, relatively complex models, and insufficient generalization. Traditional deep learning architectures, such as convolutional neural network (CNN) variants, do not consider the spatial hierarchies between features and lose some information about the precise position of a feature within the feature region, which is crucial for a malware file with specific sections. In this paper, we draw on the idea of image classification from computer vision and propose a novel malware detection method based on a capsule network architecture with hyper-parameter-optimized convolutional layers (MalCaps), which overcomes the limitations of CNNs by removing the need for a pooling layer and introducing capsule layers. First, the malware is transformed into a grayscale image. Then, the dynamic-routing-based capsule network is used to detect and classify the image. Without advanced feature extraction and with only a small number of labeled samples, the presented method is tested on the unbalanced Microsoft Malware Classification Challenge (MMCC) dataset and achieves a testing accuracy of 99.34%, improving on a number of traditional deep learning models proposed in the recent malware classification literature.
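The first step described above, turning a malware file into a grayscale image, can be sketched as follows. The abstract gives no details of the conversion, so this example assumes the common approach of treating each byte as one pixel intensity (0 to 255) and wrapping the byte stream at a fixed row width; the width of 256 and the zero padding of the last row are assumptions, and `bytes_to_grayscale` is a hypothetical helper name.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Map raw bytes to rows of pixel intensities in the range 0..255.

    Assumption: one byte per pixel, fixed row width, and the final row is
    zero-padded so that every row has the same length.
    """
    if width < 1:
        raise ValueError("width must be positive")
    rows = -(-len(data) // width)  # ceiling division
    padded = data + bytes(rows * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(rows)]


image = bytes_to_grayscale(b"\x00\xff\x10", width=2)
```

The resulting nested list can be converted to an array or image object by whichever library the classifier expects.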
Because object detection datasets are smaller than the image recognition dataset ImageNet, transfer learning has become a basic training method for deep learning object detection models: the backbone network of the detection model is pre-trained on ImageNet to extract features for detection tasks. However, the classification task in detection focuses on the salient region features of an object, while the localization task focuses on edge features, so there is a certain deviation between the features extracted by a pretrained backbone network and those needed by the localization task. To solve this problem, this paper proposes a decoupled self-attention (DSA) module for one-stage object detection models. A DSA module includes two decoupled self-attention branches, so it can extract appropriate features for different tasks. It is located between the Feature Pyramid Network (FPN) and the head networks of the subtasks, and it independently extracts global features for each task from the FPN-fused features. Although the DSA module is simple, it effectively improves object detection performance and can easily be embedded in many detection models. Our experiments are based on the representative one-stage detection model RetinaNet. On the Common Objects in Context (COCO) dataset, when ResNet50 and ResNet101 are used as backbone networks, detection performance increases by 0.4% and 0.5% AP, respectively. When the DSA module and the object confidence task are both applied in RetinaNet, detection performance based on ResNet50 and ResNet101 increases by 1.0% and 1.4% AP, respectively. The experimental results show the effectiveness of the DSA module.
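The idea of two decoupled self-attention branches operating on the same fused features can be sketched as below. The abstract does not specify the attention formulation, so this example assumes plain scaled dot-product self-attention with a residual connection, and it represents a feature map as a list of position vectors. All function names and weight matrices are hypothetical; this is not the authors' module.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: Matrix, wq: Matrix, wk: Matrix, wv: Matrix) -> Matrix:
    """Scaled dot-product self-attention plus a residual (both assumed)."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    attended = matmul([softmax(r) for r in scores], v)
    return [[xi + ai for xi, ai in zip(xr, ar)] for xr, ar in zip(x, attended)]


def decoupled_self_attention(
    x: Matrix, cls_weights: Sequence[Matrix], loc_weights: Sequence[Matrix]
) -> Tuple[Matrix, Matrix]:
    """Run two independent branches on the same input, one per subtask."""
    return self_attention(x, *cls_weights), self_attention(x, *loc_weights)


identity = [[1.0, 0.0], [0.0, 1.0]]
zero = [[0.0, 0.0], [0.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0]]  # two positions, two channels
cls_out, loc_out = decoupled_self_attention(
    features, (identity, identity, zero), (identity, identity, identity)
)
```

Because each branch has its own projection weights, the classification and localization heads receive different features from the same input, which is the decoupling the abstract describes.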