Abstract: Query by Image Content (QBIC) systems, subsequently known as Content-Based Image Retrieval (CBIR) systems, can offer an advantageous solution in a variety of applications, including medical imaging, meteorology, search by image, and others. Such systems primarily use similarity-matching algorithms to compare image content and retrieve relevant images from databases. They essentially measure the distance between visual features extracted from a query image and their counterparts in the dataset. One of the most…
“…Database images that fall outside the interval are ignored, which speeds the model up. Recent CBIR systems use convolutional layers as the deep-learning architecture for feature extraction [2], [5], [8], [13]. Some of the most recent CBIR systems use Transformers for feature extraction, which has led to satisfactory results [3], [4], [6], [7], [8]. Transformers have driven a major evolution in AI, including image processing: [4], [6], [7], and [8] used the Vision Transformer as a feature extractor. [6] used the Vision Transformer with a metric-learning objective. [8] used the Vision Transformer for Sketched-Real Image Retrieval (SRIR) alongside Info-GAN on the ESRIR dataset. I used the BEIT Transformer, which achieved better top-1 accuracy than the Vision Transformer on ImageNet classification [15].…”
Section: Introduction
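The quoted passage above mentions that database images falling outside an interval are skipped to speed retrieval up, but the excerpt does not specify the pruning criterion. The sketch below illustrates one plausible interval test (a scalar per image, e.g. a feature norm, compared against a band around the query's value); the function name, the tolerance parameter, and the toy values are all hypothetical.

```python
def prune_by_interval(query_norm, db_norms, tol):
    """Keep only database images whose scalar summary (here, a
    feature norm) lies within [query_norm - tol, query_norm + tol];
    the rest are skipped entirely, avoiding full feature comparison."""
    lo, hi = query_norm - tol, query_norm + tol
    return [name for name, n in db_norms.items() if lo <= n <= hi]

# Toy database: image name -> precomputed feature norm.
db = {"a": 0.9, "b": 2.5, "c": 1.1}
print(prune_by_interval(1.0, db, 0.2))   # ['a', 'c']
```

Only the surviving candidates would then go through the expensive feature-distance comparison.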
I present DHam, a new and exact unsupervised-learning model for Content-Based Image Retrieval (CBIR) that does not need any training dataset. DHam is accurate especially when dealing with multiple-objects images with background (MOIBs). This is the first time that pre-trained Detic and a pre-trained self-supervised image transformer (SSIT), BEIT, have been combined for CBIR. First, I use pre-trained Detic to detect the objects in an image. Then I extract each object's features with pre-trained BEIT. DHam shows its superiority when the search is performed among multiple-objects images with background (MOIBs). In addition, DHam's test results are compared with pure-BEIT and pure-ResNet CBIR models. On the other hand, it is not a fast model: it takes around 19 and 273 seconds to compare the input image with 44,891 and 1,868,672 features, respectively. Compared to state-of-the-art CBIR systems, DHam may return irrelevant images but is less likely to miss the target similar image.
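The abstract above describes a two-stage pipeline: Detic proposes object regions, BEIT embeds each region into a feature vector, and retrieval then reduces to comparing sets of per-object features. The excerpt does not spell out the matching rule, so the sketch below shows only one plausible choice (smallest pairwise Euclidean distance between any query-object feature and any database-object feature); the function names and the 2-D toy features standing in for BEIT embeddings are hypothetical.

```python
import numpy as np

def min_pairwise_distance(query_feats, db_feats):
    """Smallest Euclidean distance between any query-object feature
    and any database-object feature (both arrays: n_objects x dim)."""
    # Pairwise differences via broadcasting: (q, 1, d) - (1, m, d)
    diffs = query_feats[:, None, :] - db_feats[None, :, :]
    return np.linalg.norm(diffs, axis=-1).min()

def rank_database(query_feats, database):
    """Rank database images by their best-matching object feature."""
    scores = {name: min_pairwise_distance(query_feats, feats)
              for name, feats in database.items()}
    return sorted(scores, key=scores.get)

# Toy example: two detected objects in the query image.
query = np.array([[0.0, 0.0], [5.0, 5.0]])
database = {
    "img_a": np.array([[0.1, 0.0]]),   # close to the first query object
    "img_b": np.array([[9.0, 9.0]]),   # far from both query objects
}
print(rank_database(query, database))  # img_a ranked first
```

Matching on the best per-object pair, rather than on one whole-image feature, is what lets a multi-object query still match a database image that shares only one of its objects.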
“…smartphones and tablets. With their widespread use, a great number of digital photos are taken by people anytime and anywhere, and are usually delivered to and then stored on cloud servers [1], [2], [3], [4], [5]. For any image among this enormous number of photos, there may be other images with similar content, which can thus be called similar images.…”
As an important paradigm of image-set management, image insertion refers to adding new photos to an existing compressed image set. Recently, several algorithms have made significant progress in image insertion. However, due to complex inter-image relationships, coding performance still has room to improve. To address this issue, this paper proposes a high-coding-efficiency image-insertion algorithm for compressed image sets. To maximize coding performance, each new picture must thoroughly exploit the correlations between itself and all the other images. Specifically, in our proposed approach images are first divided into two kinds: to-be-inserted images and compressed images, where the former comprises the new photographs and the latter composes the compressed image set. Second, a depth- and topology-constrained minimum spanning tree (DTCMST) heuristic is proposed to fully investigate the relationships not only between to-be-inserted images and compressed images but also among the different to-be-inserted images. With the generated DTCMST, the depth requirement of the existing compressed image set is satisfied and its topology structure is kept unchanged. Finally, after encoding each to-be-inserted image using its assigned parent vertex in the DTCMST as its prediction reference, a new compressed image set is eventually established. Experimental results show that, compared with state-of-the-art methods, our proposed algorithm achieves an average bit-rate saving of up to 5.1 % and a Bjøntegaard-delta peak-signal-to-noise-ratio improvement of up to 0.41 dB, with similar computational complexity.

INDEX TERMS Image insertion, compressed image set, minimum spanning tree, coding efficiency, image set management.
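The abstract above outlines building a spanning tree over images, where attaching a child to a parent means coding the child predictively from that parent, subject to a maximum tree depth and a frozen existing topology. The sketch below is only an illustration of that idea as a greedy depth-constrained attachment, not the paper's actual DTCMST construction; the function name, the cost table, and the toy values are hypothetical.

```python
def insert_images(cost, depth, fixed, new_ids, max_depth):
    """Greedy depth-constrained attachment (illustrative only):
    each to-be-inserted image picks the cheapest parent whose depth
    stays below max_depth; the existing tree is never restructured.

    cost[i][j] -- prediction cost of coding image i from parent j
    depth      -- dict: image id -> depth in the existing tree
    fixed      -- ids already in the compressed set (initial parents)
    new_ids    -- ids of the to-be-inserted images
    """
    depth = dict(depth)          # do not mutate the caller's tree
    parents = {}
    pending = set(new_ids)
    candidates = set(fixed)
    while pending:
        # Globally cheapest valid (child, parent) attachment.
        child, parent = min(
            ((i, j) for i in pending for j in candidates
             if depth[j] + 1 <= max_depth),
            key=lambda e: cost[e[0]][e[1]],
        )
        parents[child] = parent
        depth[child] = depth[parent] + 1
        pending.remove(child)
        candidates.add(child)    # new images may parent later ones
    return parents

# Toy example: image 0 is the root of the existing set (depth 0);
# images 1 and 2 are new.  cost[i][j] = cost of predicting i from j.
cost = {1: {0: 5, 2: 1}, 2: {0: 2, 1: 10}}
print(insert_images(cost, {0: 0}, {0}, [1, 2], max_depth=2))
# image 2 attaches to 0, then image 1 attaches to 2
```

Allowing already-attached new images to serve as parents is what captures the correlations among the to-be-inserted images themselves, not just their correlations with the compressed set.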