The Bag-of-Visual-Words (BoVW) model is widely used for image classification, object recognition and image retrieval. In the BoVW model, local features are quantized and the 2-D image space is represented as an orderless histogram of visual words. Image classification performance suffers because of this orderless representation. This paper presents a novel image representation that incorporates spatial information into the inverted index of the BoVW model. The spatial information is added by calculating the global relative spatial orientation of visual words in a rotation-invariant manner. For this, we compute the geometric relationship between triplets of identical visual words by calculating an orthogonal vector relative to each point in the triplet. The histogram of visual words is calculated on the basis of the magnitude of these orthogonal vectors. This calculation provides unique information regarding the relative position of visual words when they are collinear. The proposed image representation is evaluated on four standard image benchmarks. The experimental results and quantitative comparisons demonstrate that the proposed image representation outperforms the existing state-of-the-art in terms of classification accuracy.
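The triplet computation described in this abstract can be illustrated with a minimal sketch. It rests on an assumed reading of the abstract: the "orthogonal vector relative to each point" is taken to be the perpendicular dropped from that point onto the line through the other two points of the triplet, and its length is what gets binned. The function names, the bin count and the `max_value` range are hypothetical choices for illustration, not the authors' implementation.

```python
import math
from itertools import combinations


def perpendicular_magnitude(p, a, b):
    """Length of the perpendicular from point p to the line through a and b."""
    base = math.dist(a, b)
    if base == 0.0:
        # Degenerate case: a and b coincide, fall back to point distance.
        return math.dist(p, a)
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return abs(cross) / base


def triplet_magnitudes(points):
    """Three perpendicular lengths for every triplet of positions of ONE visual word."""
    mags = []
    for a, b, c in combinations(points, 3):
        mags.append(perpendicular_magnitude(a, b, c))
        mags.append(perpendicular_magnitude(b, a, c))
        mags.append(perpendicular_magnitude(c, a, b))
    return mags


def magnitude_histogram(points, bins=4, max_value=10.0):
    """Bin the magnitudes; values at or above max_value land in the last bin (assumed)."""
    hist = [0] * bins
    for m in triplet_magnitudes(points):
        idx = min(int(m / max_value * bins), bins - 1)
        hist[idx] += 1
    return hist


def rotate(points, angle):
    """Rotate points about the origin by angle (radians), to check invariance."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]
```

Because the perpendicular length depends only on the relative positions of the three points, rotating the image leaves the magnitudes unchanged, and a collinear triplet yields zero for all three values.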
Melanoma is considered the most serious type of skin cancer. Worldwide, its mortality rate is much higher than that of other skin cancers. Various computer-aided solutions have been proposed to identify melanoma correctly. However, the complex visual appearance of the nevus makes it very difficult to design a reliable Computer-Aided Diagnosis (CAD) system for accurate melanoma detection. Existing systems either use traditional machine learning models and focus on handpicked features, or use deep learning-based methods that learn features from complete images. Automatically extracting the most discriminative features for skin cancer remains an important research problem, and solving it can in turn improve deep learning training. Furthermore, the limited number of available images also creates a problem for deep learning models. Following this line of research, we propose an intelligent Region of Interest (ROI) based system that uses transfer learning to identify melanoma and discriminate it from nevus. An improved k-means algorithm is used to extract ROIs from the images. This ROI-based approach helps to identify discriminative features, as only images containing melanoma cells are used to train the system. We further use a Convolutional Neural Network (CNN) based transfer learning model with data augmentation on ROI images from the DermIS and DermQuest datasets. The proposed system achieves 97.9% and 97.4% accuracy on DermIS and DermQuest, respectively. The proposed ROI-based transfer learning approach outperforms existing methods that use complete images for classification.
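The ROI extraction step can be sketched with plain k-means on pixel intensities. This is standard k-means, not the improved variant the abstract refers to, and it assumes a grayscale image in which the lesion is the darkest cluster. All names (`kmeans_1d`, `roi_mask`, `bounding_box`) and the deterministic initialisation are hypothetical choices for illustration.

```python
def kmeans_1d(values, k=2, iters=20):
    """Plain k-means on scalar intensities; requires k >= 2."""
    lo, hi = min(values), max(values)
    # Deterministic initialisation: centers spread evenly over the value range.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        new_centers = [
            sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def roi_mask(image, k=2):
    """Boolean mask of pixels assigned to the darkest cluster (assumed lesion)."""
    flat = [p for row in image for p in row]
    centers = kmeans_1d(flat, k)
    darkest = min(range(k), key=lambda j: centers[j])
    return [
        [min(range(k), key=lambda j: abs(p - centers[j])) == darkest for p in row]
        for row in image
    ]


def bounding_box(mask):
    """(top, left, bottom, right) of the True region, or None if the mask is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]
```

The bounding box of the mask gives a crop that could then be passed to a classifier in place of the complete image.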
Due to recent developments in technology, the complexity of multimedia has increased significantly, and the retrieval of similar multimedia content is an open research problem. Content-Based Image Retrieval (CBIR) provides a framework for image search in which low-level visual features are commonly used to retrieve images from an image database. The basic requirement of any image retrieval process is to rank images by how closely they resemble the query in visual appearance. Color, shape and texture are examples of low-level image features. Features play a significant role in image processing. A feature vector is a compact representation of an image, and feature extraction techniques are applied to obtain features that are useful for classifying and recognizing images. Because features characterize an image, they determine the storage required, the efficiency of classification and the time consumed. In this paper, we discuss various types of features and feature extraction techniques, and explain which technique is better suited to which scenario. The effectiveness of a CBIR approach depends fundamentally on feature extraction. In image processing tasks such as object recognition and image retrieval, the feature descriptor is among the most essential steps. The main idea of CBIR is to use distance metrics to search a dataset for images related to a query image. The proposed image retrieval method is built on the YCbCr color space with a Canny edge histogram and the discrete wavelet transform. Combining the edge histogram with the discrete wavelet transform improves the performance of the image retrieval framework for content-based search. The performance of different wavelets is also compared to determine the suitability of a specific wavelet function for image retrieval.
The proposed algorithm is trained and tested on the Wang image database. For image retrieval, Artificial Neural Networks (ANN) are applied to a standard dataset in the CBIR domain. The performance of the proposed descriptors is assessed by computing precision and recall values, and a comparison with other proposed methods demonstrates the superiority of our method. The proposed approach outperforms existing research in terms of average precision and recall values.
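The precision and recall measures mentioned here follow their standard definitions for a single query, which a short sketch makes concrete. The image identifiers and the helper names are hypothetical; averaging over queries is shown as a plain mean, which is an assumption about how "average precision and recall" is computed in the paper.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall for one query.

    retrieved: identifiers returned by the system.
    relevant: identifiers that truly belong to the query's class.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def mean_precision_recall(results):
    """Plain mean over (retrieved, relevant) pairs, one pair per query."""
    pairs = [precision_recall(ret, rel) for ret, rel in results]
    if not pairs:
        return 0.0, 0.0
    n = len(pairs)
    return sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n
```

For example, retrieving four images of which two are among three relevant ones gives a precision of 2/4 and a recall of 2/3.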
One of the major requirements of content-based image retrieval (CBIR) systems is to ensure meaningful image retrieval against query images. The performance of these systems is severely degraded when image content that does not contain the objects of interest is included during the image representation phase. Segmentation of the images is considered a solution, but no technique can guarantee robust object extraction. Another limitation is that most image segmentation techniques are slow and their results are not reliable. To overcome these problems, a bandelet transform based image representation technique is presented in this paper, which reliably returns information about the major objects found in an image. For image retrieval purposes, artificial neural networks (ANN) are applied, and the performance of the system is evaluated on three standard data sets used in the domain of CBIR.
The requirement for effective image search, which motivates the use of Content-Based Image Retrieval (CBIR) and the search for similar multimedia content on the basis of a user query, remains an open research problem for computer vision applications. The application domains for Bag of Visual Words (BoVW) based image representations are object recognition, image classification and content-based image analysis. Interest points are quantized in the feature space, and the final histogram or image signature does not retain any detail about co-occurrences of features in the 2D image space. This spatial information is crucial, as its absence adversely affects the performance of an image classification-based model. The most notable contribution in this context is Spatial Pyramid Matching (SPM), which captures the absolute spatial distribution of visual words. However, SPM is sensitive to image transformations such as rotation, flipping and translation. When images are not well-aligned, SPM may lose its discriminative power. This paper introduces a novel approach to encoding relative spatial information for the histogram-based representation of the BoVW model. This is achieved by computing the global geometric relationship between pairs of identical visual words with respect to the centroid of an image. The proposed approach is evaluated on five different datasets. Comprehensive experiments demonstrate the robustness of the proposed image representation compared with state-of-the-art methods in terms of precision and recall values.
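One way to picture the pairwise relationship with respect to the centroid is the angle that two occurrences of the same visual word subtend at the centroid. The sketch below assumes that reading; the abstract does not specify the exact geometric quantity, so the angle, the bin count and every function name here are hypothetical.

```python
import math
from itertools import combinations


def angle_at_centroid(p, q, c):
    """Angle in degrees (0 to 180) between vectors c->p and c->q."""
    v1 = (p[0] - c[0], p[1] - c[1])
    v2 = (q[0] - c[0], q[1] - c[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        return 0.0  # a point sitting on the centroid defines no direction
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp against rounding so acos never receives a value outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def pairwise_angle_histogram(keypoints, centroid, bins=4):
    """keypoints: list of (word_id, x, y). Returns {word_id: angle histogram}."""
    by_word = {}
    for word, x, y in keypoints:
        by_word.setdefault(word, []).append((x, y))
    hist = {word: [0] * bins for word in by_word}
    for word, pts in by_word.items():
        for p, q in combinations(pts, 2):
            theta = angle_at_centroid(p, q, centroid)
            idx = min(int(theta / 180.0 * bins), bins - 1)
            hist[word][idx] += 1
    return hist
```

The angle depends only on the positions relative to the centroid, so rotating the image about the centroid leaves each histogram unchanged, unlike the fixed grid cells used by SPM.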
As digital images play a vital role in multimedia content, the automatic classification of images is an open research problem. The Bag of Visual Words (BoVW) model is used for image classification, retrieval and object recognition problems. In the BoVW model, a histogram of visual words is computed without considering the spatial layout of the 2-D image space. The performance of BoVW suffers due to this lack of information about the spatial details of an image. Spatial Pyramid Matching (SPM) is a popular technique that captures the spatial layout of the 2-D image space. However, SPM is not rotation-invariant, does not tolerate changes in pose and viewpoint, and represents the image in a very high dimensional space. In this paper, the spatial contents of an image are added and rotations are handled more efficiently than in existing approaches that incorporate spatial contents. The spatial information is added by constructing a histogram of circles, while rotations are handled by using concentric circles. A weighted scheme is applied to represent the image in the form of a histogram of visual words. Extensive evaluation on benchmark datasets and comparison with recent classification models demonstrate the effectiveness of the proposed approach. The proposed representation outperforms state-of-the-art methods in terms of classification accuracy.
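The concentric-circle idea can be sketched as a per-ring histogram of visual words around a chosen center. This is a minimal illustration under stated assumptions: rings have equal width, the center and `max_radius` are supplied by the caller, and the weighting is a simple per-ring multiplier. The abstract does not give the actual weighting scheme, and all names below are hypothetical.

```python
import math


def ring_index(x, y, center, max_radius, rings):
    """Index of the equal-width ring containing (x, y); outliers go to the last ring."""
    r = math.hypot(x - center[0], y - center[1])
    return min(int(r / max_radius * rings), rings - 1)


def concentric_histogram(keypoints, center, max_radius, vocab_size,
                         rings=3, weights=None):
    """keypoints: list of (word_id, x, y) with word_id in range(vocab_size).

    Returns one flat vector: the per-ring word histograms concatenated,
    each entry scaled by its ring weight (assumed scheme).
    """
    if weights is None:
        weights = [1.0] * rings
    hist = [[0.0] * vocab_size for _ in range(rings)]
    for word, x, y in keypoints:
        k = ring_index(x, y, center, max_radius, rings)
        hist[k][word] += weights[k]
    return [value for ring in hist for value in ring]
```

A rotation about the center does not change a keypoint's distance from it, so every keypoint stays in the same ring and the vector is unchanged. The vector length is `rings * vocab_size`, which grows linearly with the number of rings.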