Vehicle detection in aerial images is an important and challenging task. Traditional target detection models based on a sliding-window search achieved acceptable accuracy, but they are time-consuming in the detection phase. Recently, with the great success of convolutional neural networks (CNNs) in computer vision, many state-of-the-art detectors have been built on deep CNNs. However, these CNN-based detectors perform poorly on aerial imagery because existing CNN-based models struggle with small-object detection and precise localization. To improve detection accuracy without sacrificing speed, we propose a detection model that cascades two independent convolutional neural networks: the first network generates a set of vehicle-like regions from multi-feature maps drawn from different hierarchies and scales. Because these multi-feature maps combine the advantages of deep and shallow convolutional layers, the first network performs well at locating small targets in aerial imagery. The generated candidate regions are then fed into the second network for feature extraction and decision making. Comprehensive experiments on the Vehicle Detection in Aerial Imagery (VEDAI) dataset and the Munich vehicle dataset show that the proposed cascaded model yields high performance in both detection accuracy and detection speed.
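The core idea of the multi-feature maps can be illustrated with a minimal NumPy sketch: a coarse deep-layer feature map is upsampled and concatenated with a fine shallow-layer map, so the fused map keeps both spatial detail (for small targets) and semantics. The function names and the nearest-neighbor upsampling here are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of a (C, H, W) feature map."""
    return fmap.repeat(2, axis=1).repeat(2, axis=2)

def fuse_features(shallow, deep):
    """Concatenate a shallow feature map with an upsampled deep one.

    shallow: (C1, H, W)     -- fine spatial detail, helps small targets
    deep:    (C2, H/2, W/2) -- coarse but semantically strong
    returns: (C1 + C2, H, W) multi-feature map
    """
    return np.concatenate([shallow, upsample2x(deep)], axis=0)

# toy example: 4x4 shallow map fused with a 2x2 deep map
shallow = np.ones((8, 4, 4))
deep = np.arange(4, dtype=float).reshape(1, 2, 2)
fused = fuse_features(shallow, deep)
print(fused.shape)  # (9, 4, 4)
```

A region-proposal head would then slide over `fused` instead of a single-layer map, which is why the first network can localize small vehicles more precisely.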
Background subtraction (BS) is one of the most commonly encountered tasks in video analysis and tracking systems. It distinguishes the foreground (moving objects) from video sequences captured by static imaging sensors, and BS in remote scene infrared (IR) video is important in many fields. This paper presents a Remote Scene IR Dataset captured by our custom-designed medium-wave infrared (MWIR) sensor. Each video sequence in the dataset is annotated with specific BS challenges, and a pixel-wise foreground (FG) ground truth is provided for each frame. A series of experiments was conducted to evaluate BS algorithms on the proposed dataset, comparing their overall performance as well as their processor and memory requirements. Appropriate evaluation metrics were employed to assess each BS algorithm's ability to handle the different BS challenges represented in the dataset. The results and conclusions provide useful references for developing new BS algorithms for remote scene IR video sequences; some of the findings are not limited to remote scene or IR video but apply to background subtraction in general. The Remote Scene IR dataset and the foreground masks detected by each evaluated BS algorithm are available online: .
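As context for the algorithms being benchmarked, a classic BS baseline is the running-average background model: the background estimate is updated slowly, and pixels that deviate from it are flagged as foreground. This NumPy sketch is a generic textbook baseline, not one of the specific algorithms evaluated in the paper, and the parameter values are illustrative.

```python
import numpy as np

def background_subtract(frames, alpha=0.05, threshold=30.0):
    """Running-average background subtraction (a classic BS baseline).

    frames:    iterable of (H, W) grayscale frames
    alpha:     background learning rate
    threshold: absolute-difference threshold for foreground
    Yields one boolean foreground mask per frame.
    """
    background = None
    for frame in frames:
        frame = frame.astype(float)
        if background is None:
            background = frame.copy()  # initialize from the first frame
        mask = np.abs(frame - background) > threshold
        # update the model only at pixels that look like background,
        # so foreground objects do not get absorbed into it
        background = np.where(mask, background,
                              (1 - alpha) * background + alpha * frame)
        yield mask
```

Benchmarks like the one in this paper typically score such per-frame masks against the pixel-wise ground truth with metrics such as precision, recall, and F-measure.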
With the development of deep learning, more and more deep learning algorithms are being applied to remote sensing image classification, detection, and semantic segmentation. Deep-learning-based landslide semantic segmentation of remote sensing images mainly uses supervised learning, whose accuracy depends on large amounts of training data with high-quality annotations. At present, high-quality annotation requires significant human effort, so the high cost of annotating remote sensing landslide images greatly restricts the development of landslide semantic segmentation algorithms. To address this high labeling cost, we propose a weakly supervised method for landslide semantic segmentation that combines class activation maps (CAMs) and a cycle-consistent generative adversarial network (CycleGAN). The method uses image-level annotations in place of pixel-level annotations as training data. First, the CAM method determines the approximate position of the landslide area. Then, CycleGAN generates a fake, landslide-free version of the image; differencing it with the real image yields an accurate segmentation of the landslide area. In this way, pixel-level segmentation of the landslide area in a remote sensing image is realized. We evaluated the proposed method with mean intersection-over-union (mIoU) and compared it with a CAM-based method: on the same test dataset, the CAM-based method achieved an mIoU of 0.157, while ours achieved 0.237. As a further comparison, a fully supervised U-Net reached an mIoU of 0.408. The experimental results show that weakly supervised learning is a feasible way to realize landslide semantic segmentation in remote sensing images, and it can greatly reduce the workload of data annotation.
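The difference-then-threshold step and the mIoU metric used above can be sketched as follows. The function names, the per-pixel mean-absolute-difference rule, and the threshold value are illustrative assumptions; the paper's actual pipeline also uses the CAM localization to restrict where the difference is taken.

```python
import numpy as np

def landslide_mask(real, fake, threshold=0.2):
    """Segment the landslide as the region where the real image differs
    from its generated landslide-free counterpart (simplified sketch).

    real, fake: (H, W, 3) images with values in [0, 1]
    returns:    (H, W) boolean landslide mask
    """
    return np.abs(real - fake).mean(axis=-1) > threshold

def miou(pred, gt):
    """Mean intersection-over-union over {background, landslide}."""
    ious = []
    for cls in (False, True):
        inter = np.logical_and(pred == cls, gt == cls).sum()
        union = np.logical_or(pred == cls, gt == cls).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```

With this convention, a perfect prediction scores an mIoU of 1.0, and the 0.157 / 0.237 / 0.408 figures reported above are averages of the same per-class IoU over the test set.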