Reconstruction-based and prediction-based approaches are widely used for video anomaly detection (VAD) in smart city surveillance applications. However, neither approach effectively exploits the rich contextual information in videos, which makes it difficult to perceive anomalous activities accurately. In this paper, we adapt the “Cloze Test” training strategy from natural language processing (NLP) and introduce a novel unsupervised learning framework that encodes both motion and appearance information at the object level. Specifically, to store the normal patterns of video activity reconstructions, we first design an optical flow memory network with skip connections. Second, we build a space–time cube (STC) as the basic processing unit of the model and erase a patch in the STC to form the frame to be reconstructed. This creates a so-called “incomplete event” (IE) that is to be completed. On this basis, a conditional autoencoder is used to capture the strong correspondence between optical flow and the STC. The model predicts the erased patches in IEs from the context of the preceding and following frames. Finally, we employ a generative adversarial network (GAN)-based training method to improve VAD performance. By discriminating the predicted erased optical flow and the erased video frame, the proposed method, which helps reconstruct the original video in the IE, yields more reliable anomaly detection results. Comparative experiments on the benchmark UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets demonstrate AUROC scores of 97.7%, 89.7%, and 75.8%, respectively.
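The erase-and-complete step described in this abstract can be illustrated with a minimal sketch. It assumes an STC is stored as a NumPy array of shape (T, H, W, C) holding T object-level patches, and that "erasing" means zeroing one patch; the function name `make_incomplete_event`, the array layout, and the zero fill are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def make_incomplete_event(stc, erase_index):
    """Blank one patch of a space-time cube (hypothetical helper).

    stc: array of shape (T, H, W, C), assumed layout of T object patches.
    Returns (incomplete_event, target_patch).
    """
    if not 0 <= erase_index < stc.shape[0]:
        raise IndexError("erase_index is outside the temporal range of the STC")
    target = stc[erase_index].copy()   # patch the model would have to predict
    incomplete = stc.copy()            # leave the original cube untouched
    incomplete[erase_index] = 0.0      # assumption: erased patch is zero-filled
    return incomplete, target

# Toy cube: 5 patches of 32x32 RGB with random content.
rng = np.random.default_rng(0)
stc = rng.random((5, 32, 32, 3)).astype(np.float32)
incomplete, target = make_incomplete_event(stc, erase_index=2)
```

In this sketch the surrounding patches (indices 0, 1, 3, 4) stay intact and play the role of the context from which a model would predict `target`.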
Synthetic Aperture Radar Automatic Target Recognition (SAR ATR) is one of the most important research directions in SAR image interpretation. While much existing research on SAR ATR has focused on deep learning, an equally important yet underexplored problem is its deployment in incremental learning scenarios. This letter proposes a new benchmark approach, termed the Memory augmented weights alignment and Enhancement Discrimination Incremental Learning (MEDIL) algorithm, to address this issue. First, an attention mechanism is employed as part of the benchmark. Next, we discuss the problem of the high deviation of weights at the fully connected layer and design a more suitable weight alignment by guiding the memory module for contextual data processing. In addition, we use an incremental progressive sampling strategy to alleviate the imbalance between old and new classes during training. Finally, we propose an angular penalty loss function that enhances the distinction among classes and ensures the diversity of incremental instances. The proposed method is evaluated on MSTAR and OpenSARShip under different experimental settings. Experimental results demonstrate that the proposed approach can effectively solve catastrophic forgetting in SAR multiclass recognition problems.
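The abstract does not state MEDIL's alignment rule. As a rough illustration of what aligning fully connected weights between old and new classes can mean, the sketch below rescales the new-class weight vectors so that their mean norm matches that of the old classes, a norm-ratio heuristic used in class-incremental learning generally. The function name `align_new_class_weights` and the row-per-class layout are assumptions, and the memory-guided part of the method is not modeled.

```python
import numpy as np

def align_new_class_weights(fc_weights, num_old):
    """Rescale new-class rows to the mean norm of old-class rows.

    fc_weights: array (num_classes, feature_dim), one row per class (assumed).
    num_old: number of leading rows that belong to previously learned classes.
    Returns (aligned_weights, gamma). This is a generic heuristic, not
    necessarily the rule used by MEDIL.
    """
    old_norms = np.linalg.norm(fc_weights[:num_old], axis=1)
    new_norms = np.linalg.norm(fc_weights[num_old:], axis=1)
    gamma = old_norms.mean() / new_norms.mean()   # scaling factor for new classes
    aligned = fc_weights.copy()
    aligned[num_old:] *= gamma
    return aligned, gamma

# Toy example: two old classes with unit-norm weights, two new classes with norm 2.
fc_weights = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.0, 2.0]])
aligned, gamma = align_new_class_weights(fc_weights, num_old=2)
```

After rescaling, the logits of new classes are no longer inflated by larger weight norms, which is the bias toward new classes that such alignment is meant to reduce.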