Human detection in videos plays an important role in various real life applications. Most of traditional approaches depend on utilizing handcrafted features which are problem-dependent and optimal for specific tasks. Moreover, they are highly susceptible to dynamical events such as illumination changes, camera jitter, and variations in object sizes. On the other hand, the proposed feature learning approaches are cheaper and easier because highly abstract and discriminative features can be produced automatically without the need of expert knowledge. In this paper, we utilize automatic feature learning methods which combine optical flow and three different deep models (i.e., supervised convolutional neural network (S-CNN), pretrained CNN feature extractor, and hierarchical extreme learning machine) for human detection in videos captured using a nonstatic camera on an aerial platform with varying altitudes. The models are trained and tested on the publicly available and highly challenging UCF-ARG aerial dataset. The comparison between these models in terms of training, testing accuracy, and learning speed is analyzed. The performance evaluation considers five human actions (digging, waving, throwing, walking, and running). Experimental results demonstrated that the proposed methods are successful for human detection task. Pretrained CNN produces an average accuracy of 98.09%. S-CNN produces an average accuracy of 95.6% with soft-max and 91.7% with Support Vector Machines (SVM). H-ELM has an average accuracy of 95.9%. Using a normal Central Processing Unit (CPU), H-ELM's training time takes 445 seconds. Learning in S-CNN takes 770 seconds with a high performance Graphical Processing Unit (GPU).
Video pornography and nudity detection aim to detect and classify people in videos into nude or normal for censorship purposes. Recent literature has demonstrated pornography detection utilising the convolutional neural network (CNN) to extract features directly from the whole frames and support vector machine (SVM) to classify the extracted features into two categories. However, existing methods were not able to detect the small-scale content of pornography and nudity in frames with diverse backgrounds. This limitation has led to a high false-negative rate (FNR) and misclassification of nude frames as normal ones. In order to address this matter, this paper explores the limitation of the existing convolutional-only approaches focusing the visual attention of CNN on the expected nude regions inside the frames to reduce the FNR. The You Only Look Once (YOLO) object detector was transferred to the pornography and nudity detection application to detect persons as regions of interest (ROIs), which were applied to CNN and SVM for nude/normal classification. Several experiments were conducted to compare the performance of various CNNs and classifiers using our proposed dataset. It was found that ResNet101 with random forest outperformed other models concerning the F1-score of 90.03% and accuracy of 87.75%. Furthermore, an ablation study was performed to demonstrate the impact of adding the YOLO before the CNN. YOLO–CNN was shown to outperform CNN-only in terms of accuracy, which was increased from 85.5% to 89.5%. Additionally, a new benchmark dataset with challenging content, including various human sizes and backgrounds, was proposed.
Given the excessive foul language identified in audio and video files and the detrimental consequences to an individual’s character and behaviour, content censorship is crucial to filter profanities from young viewers with higher exposure to uncensored content. Although manual detection and censorship were implemented, the methods proved tedious. Inevitably, misidentifications involving foul language owing to human weariness and the low performance in human visual systems concerning long screening time occurred. As such, this paper proposed an intelligent system for foul language censorship through a mechanized and strong detection method using advanced deep Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) through Long Short-Term Memory (LSTM) cells. Data on foul language were collected, annotated, augmented, and analysed for the development and evaluation of both CNN and RNN configurations. Hence, the results indicated the feasibility of the suggested systems by reporting a high volume of curse word identifications with only 2.53% to 5.92% of False Negative Rate (FNR). The proposed system outperformed state-of-the-art pre-trained neural networks on the novel foul language dataset and proved to reduce the computational cost with minimal trainable parameters.
Rivers carry suspended sediments along with their flow. These sediments deposit at different places depending on the discharge and course of the river. However, the deposition of these sediments impacts environmental health, agricultural activities, and portable water sources. Deposition of suspended sediments reduces the flow area, thus affecting the movement of aquatic lives and ultimately leading to the change of river course. Thus, the data of suspended sediments and their variation is crucial information for various authorities. Various authorities require the forecasted data of suspended sediments in the river to operate various hydraulic structures properly. Usually, the prediction of suspended sediment concentration (SSC) is challenging due to various factors, including site-related data, site-related modelling, lack of multiple observed factors used for prediction, and pattern complexity.Therefore, to address previous problems, this study proposes a Long Short Term Memory model to predict suspended sediments in Malaysia's Johor River utilizing only one observed factor, including discharge data. The data was collected for the period of 1988–1998. Four different models were tested, in this study, for the prediction of suspended sediments, which are: ElasticNet Linear Regression (L.R.), Multi-Layer Perceptron (MLP) neural network, Extreme Gradient Boosting, and Long Short-Term Memory. Predictions were analysed based on four different scenarios such as daily, weekly, 10-daily, and monthly. Performance evaluation stated that Long Short-Term Memory outperformed other models with the regression values of 92.01%, 96.56%, 96.71%, and 99.45% daily, weekly, 10-days, and monthly scenarios, respectively.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.