Advances in computer and multimedia technology and the introduction of the World Wide Web have increased the volume of image databases and collections, for example medical imagery, digital libraries, and art galleries, which together contain millions of images. Retrieving images from such huge databases with traditional methods such as Text-Based Image Retrieval, Color Histogram, and Chi-Square Distance may take a long time to find the desired images. It is therefore necessary to develop an effective image retrieval system that can handle these huge volumes of images at once. The main purpose is to build a robust system that builds, executes, and responds to data efficiently. A Content-Based Image Retrieval (CBIR) system has been developed as an efficient image retrieval tool, where users provide a query and the system retrieves the desired images from the image collection. Moreover, with the growth of web development and transmission networks, the number of images available to users continues to grow. We propose an effective deep learning framework based on a Convolutional Neural Network (CNN) and a Support Vector Machine (SVM) for fast image retrieval. The proposed architecture extracts features with the CNN and performs classification with the SVM. The results demonstrate the robustness of the system.
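The CNN-for-features, SVM-for-classification design described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the 8-dimensional feature vectors are synthetic stand-ins for real CNN embeddings, and the class labels are hypothetical.

```python
# Sketch of the classification stage of a CNN+SVM retrieval pipeline.
# Assumption: features have already been extracted by a CNN; here we use
# synthetic, well-separated vectors as stand-ins for CNN embeddings.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Two synthetic "feature" clusters standing in for two image classes.
class_a = rng.normal(loc=0.0, scale=0.1, size=(20, 8))
class_b = rng.normal(loc=1.0, scale=0.1, size=(20, 8))
features = np.vstack([class_a, class_b])
labels = np.array([0] * 20 + [1] * 20)

# A linear kernel keeps the sketch simple; the choice of kernel is a tunable.
clf = SVC(kernel="linear")
clf.fit(features, labels)

# Classify the feature vectors of two new "query images", one from each class.
queries = np.array([[0.05] * 8, [0.95] * 8])
predictions = clf.predict(queries)
```

In a real CBIR system, the predicted class would then narrow the search to images of that class before ranking by feature similarity.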
Weed management plays a vital role in agricultural applications. One of the key tasks is to identify weeds a few days after plant germination, which helps farmers perform early-stage weed management and reduce adverse impacts on crop growth. We therefore aim to classify seedlings of crop and weed species. In this work, we propose a plant seedlings classification approach using the benchmark Plant Seedlings dataset. The dataset contains images of 12 different species, of which three are crop species and the other nine are weed species. We implement the classification framework using three different deep convolutional neural network architectures, namely ResNet50V2, MobileNetV2, and EfficientNetB0. We train the models using transfer learning and compare the performance of each model on a test dataset of 833 images. Comparing the three models, we demonstrate that EfficientNetB0 performs best, with an average F1-score of 96.26% and an accuracy of 96.52%.
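The average (macro) F1-score used to compare the three models can be computed as the mean of the per-class F1 scores, so that each of the 12 species counts equally regardless of how many test images it has. A minimal sketch, with illustrative labels rather than the paper's actual test set:

```python
# Macro-averaged F1: compute precision, recall, and F1 per class, then average.
# The label arrays below are illustrative toy data, not the paper's results.
import numpy as np

def macro_f1(y_true, y_pred, n_classes):
    """Mean of per-class F1 scores, treating every class equally."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return float(np.mean(scores))

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2])
score = macro_f1(y_true, y_pred, n_classes=3)
```

Here class 0 is perfectly predicted (F1 = 1), class 1 misses one sample (F1 = 2/3), and class 2 has one false positive (F1 = 0.8), giving a macro F1 of 37/45 ≈ 0.822.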
Manual tumor diagnosis from magnetic resonance images (MRIs) is a time-consuming procedure that is prone to human error and may lead to false detection and classification of the tumor type. Therefore, to automate this complex medical process, a deep learning framework is proposed for brain tumor classification to ease the task of medical diagnosis for doctors. Publicly available datasets, Kaggle and BraTS, are used for the analysis of brain images. The proposed model is implemented on three pre-trained Deep Convolutional Neural Network (DCNN) architectures: AlexNet, VGG16, and ResNet50. These architectures are used as transfer learning methods to extract features, and the extracted features are classified using a Support Vector Machine (SVM) classifier. Data augmentation methods are applied to the MR images to prevent the network from overfitting. The proposed methodology achieves overall accuracies of 98.28% and 97.87% without data augmentation, and 99.0% and 98.86% with data augmentation, on the Kaggle and BraTS datasets, respectively. The Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) is 0.9978 and 0.9850 for the same datasets. The results show that ResNet50 performs best in the classification of brain tumors when compared with the other two networks.
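The reported ROC AUC can be computed directly from classifier scores via the Mann-Whitney rank formulation: the AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A minimal sketch with illustrative scores, not the paper's outputs:

```python
# ROC AUC via pairwise comparison (Mann-Whitney statistic).
# Scores and labels below are toy values for illustration only.
import numpy as np

def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative
    (ties count as half a win)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

With these toy values, three of the four positive/negative pairs are correctly ordered, so the AUC is 0.75; a perfect classifier would score 1.0.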
Intelligent decision-making systems require the ability to forecast, foresee, and reason about future events. The problem of video frame prediction has attracted much attention due to its usefulness in many computer vision applications, such as autonomous vehicles and robots. Recent deep learning advances have significantly improved video prediction performance. Nevertheless, as top-performing systems attempt to foresee ever more future frames, their predictions become increasingly blurry. We developed a method for predicting a future frame from a series of prior frames using the Convolutional Long Short-Term Memory (ConvLSTM) model. The input video is segmented into frames, which are fed to the ConvLSTM model to extract features and forecast a future frame, which can be beneficial in a variety of applications. We use two metrics to measure the quality of the predicted frame: the structural similarity index (SSIM) and the perceptual distance, which help quantify the difference between the actual frame and the predicted frame. The UCF101 dataset, a collection of realistic action videos taken from YouTube with 101 action categories, is used for training and testing. The ConvLSTM model is trained and tested on 24 categories from this dataset, and a future frame is predicted with satisfactory results: an SSIM of 0.95 and a perceptual distance of 24.28. The results of the proposed work also compare favorably with state-of-the-art approaches.
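The SSIM metric used above compares luminance, contrast, and structure between the predicted and ground-truth frames. The sketch below computes the SSIM statistic once over the whole image; a full implementation (as in standard image-processing libraries) averages it over local sliding windows, so treat this as a simplified illustration rather than the evaluation code used in the work.

```python
# Simplified, single-window SSIM between two images in [0, 1].
# Full SSIM averages this statistic over local windows; this global
# version illustrates the formula only.
import numpy as np

def global_ssim(x, y, data_range=1.0):
    c1 = (0.01 * data_range) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizer for the contrast/structure term
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

rng = np.random.default_rng(0)
frame = rng.random((32, 32))                       # stand-in "ground-truth" frame
identical = global_ssim(frame, frame)              # identical frames give SSIM = 1
degraded = np.clip(frame + 0.2 * rng.random((32, 32)), 0.0, 1.0)
noisy = global_ssim(frame, degraded)               # degradation lowers SSIM
```

An SSIM near 1 (such as the 0.95 reported) indicates the predicted frame is structurally very close to the actual one.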