2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.01416

Sketchformer: Transformer-Based Representation for Sketched Structure

Abstract: Sketchformer is a novel transformer-based representation for encoding free-hand sketches input in a vector form, i.e. as a sequence of strokes. Sketchformer effectively addresses multiple tasks: sketch classification, sketch based image retrieval (SBIR), and the reconstruction and interpolation of sketches. We report several variants exploring continuous and tokenized input representations, and contrast their performance. Our learned embedding, driven by a dictionary learning tokenization scheme, yields state …
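
As a reading aid, the following is a minimal, hypothetical PyTorch sketch of the continuous-input variant implied by the abstract: each stroke point (dx, dy, pen state) is projected into a Transformer encoder and pooled into a fixed-length sketch embedding usable for classification, retrieval, or reconstruction heads. The class name, hyperparameters, and pooling choice are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class SketchEncoder(nn.Module):
    def __init__(self, d_model=256, nhead=8, num_layers=4):
        super().__init__()
        # Each stroke point is (dx, dy, pen_state); project it to the model width.
        # (Assumed 3-value point format, not necessarily the paper's exact encoding.)
        self.input_proj = nn.Linear(3, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, strokes, padding_mask=None):
        # strokes: (batch, seq_len, 3); padding_mask: (batch, seq_len) booleans marking padded steps
        x = self.input_proj(strokes)
        h = self.encoder(x, src_key_padding_mask=padding_mask)
        # Mean-pool over time to obtain a fixed-length sketch embedding.
        return h.mean(dim=1)

# Usage: a batch of 2 sketches, each with 100 stroke points.
model = SketchEncoder()
embedding = model(torch.randn(2, 100, 3))
print(embedding.shape)  # torch.Size([2, 256])

The tokenized variant mentioned in the abstract would instead quantize stroke segments against a learned dictionary and replace the linear projection with an embedding lookup (nn.Embedding); positional encodings are omitted here for brevity.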

Cited by 53 publications (14 citation statements) | References 34 publications
Citation statements (ordered by relevance):
“…This method achieves good recognition performance but suffers from a complex training process and a huge number of model parameters. Ribeiro et al. [11] constructed a sketch recognition network based on the transformer structure that encodes sketches into feature vectors and uses the stroke sequence of sketches as input to the network, enhancing the network’s ability to learn the stroke sequences of complex sketches. Jain et al. [12] further designed TransSketchNet in accordance with the transformer structure, which improves the network’s ability to extract more valuable features by fully exploiting the stroke sequences and the attention mechanism.…”
Section: Learning-Based Methods for Sketch Recognition
confidence: 99%
“…ASR performance is reported to improve when combining the CTC loss with the attention mechanism [28] or when using the Transformer structure [14,15]. In particular, the Transformer structure, which was originally designed for natural language processing (NLP) problems [29,30], has been successfully utilized in several other domains, such as computer vision (CV) [31,32] and speech-related tasks including text-to-speech (TTS) [33,34,18,19], voice conversion (VC) [35], and ASR [12,13].…”
Section: Related Work
confidence: 99%