Security has always been of paramount importance to people. Without a sense of security at the workplace, at home, or anywhere else, people feel uneasy and vulnerable. As technology advances and time becomes scarcer, the need for faster, more efficient, more accurate, and lower-cost security techniques is greater than ever. This paper proposes Image Captioning for Video Surveillance System, a visual system that generates contextual descriptions of objects in images and uses these descriptions to provide information about the scene that needs to be secured. The proposed system uses a neural network model composed of a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network to caption the incoming video feed. The main contribution of this paper is the integration of the system with the Discrete Wavelet Transform (DWT): the DWT is applied to the incoming video feed so that the compressed LL-band frames transferred wirelessly to the model are smaller than the original frames, leading to shorter transfer times and faster processing by the model.
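To make the DWT step concrete, the following is a minimal sketch of how an LL band could be extracted from one grayscale frame. It assumes a single-level Haar wavelet and even frame dimensions; the abstract does not state which wavelet or how many decomposition levels the authors use, and the function name `haar_ll_band` is hypothetical.

```python
def haar_ll_band(frame):
    """Return the LL (approximation) band of a single-level 2D Haar DWT.

    `frame` is a grayscale image given as a list of equal-length rows.
    Assumption: height and width are even, to keep the sketch short.
    """
    height, width = len(frame), len(frame[0])
    if height % 2 or width % 2:
        raise ValueError("frame height and width must be even")

    ll = []
    for r in range(0, height, 2):
        row = []
        for c in range(0, width, 2):
            # Orthonormal Haar low-pass along rows, then along columns,
            # reduces each 2x2 block to (a + b + c + d) / 2.
            block_sum = (frame[r][c] + frame[r][c + 1]
                         + frame[r + 1][c] + frame[r + 1][c + 1])
            row.append(block_sum / 2)
        ll.append(row)
    return ll


# Illustrative 4x4 frame (made-up pixel values, not data from the paper).
frame = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
ll = haar_ll_band(frame)
print(ll)  # [[20.0, 40.0], [60.0, 80.0]]
```

The LL band has half the height and half the width of the input, so each transmitted frame carries one quarter of the original number of samples, which is the source of the reduced transfer time described above.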