2024
DOI: 10.21203/rs.3.rs-4496133/v1
Preprint

A Comparative Analysis of Convolutional Neural Network and Vision Transformer Embeddings on a Novel Domain-Specific Task

Nuriel Sahlom Mor

Abstract: The Vision Transformer (ViT) architecture uses the self-attention mechanism and transformer architecture originally designed for natural language processing (NLP), which enables ViTs to capture global relationships and long-range dependencies within images. The purpose of our study was to compare the performance of embeddings generated by a Convolutional Neural Network (CNN) and a Vision Transformer (ViT) on a novel domain-specific task which was not presented at any point to the models prior to the process of the embed…

Cited by 0 publications. References 6 publications (13 reference statements).