2019
DOI: 10.1109/access.2019.2923552
DeepStyle: Multimodal Search Engine for Fashion and Interior Design

Abstract: In this paper, we propose a multimodal search engine that combines visual and textual cues to retrieve items from a multimedia database aesthetically similar to the query. The goal of our engine is to enable intuitive retrieval of fashion merchandise such as clothes or furniture. Existing search engines treat textual input only as an additional source of information about the query image and do not correspond to the real-life scenario where the user looks for "the same shirt but of denim". Our novel method, du…
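The "same shirt but of denim" scenario from the abstract can be illustrated as blending an image embedding with a text-modifier embedding into a single query vector, then ranking the database by cosine similarity. The linear blend, the `alpha` weight, and the function names below are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def blend_query(image_emb, text_emb, alpha=0.5):
    """Blend an image embedding with a text-modifier embedding into one
    query vector. The linear blend and `alpha` are assumptions made for
    illustration, not the method described in the paper."""
    q = alpha * np.asarray(image_emb, dtype=float) \
        + (1.0 - alpha) * np.asarray(text_emb, dtype=float)
    return q / np.linalg.norm(q)  # unit-normalize for cosine retrieval

def retrieve(query, database, k=3):
    """Rank database rows by cosine similarity to the query and
    return the indices of the top-k matches."""
    db = np.asarray(database, dtype=float)
    db = db / np.linalg.norm(db, axis=1, keepdims=True)
    return np.argsort(-(db @ query))[:k]
```

For example, blending a "shirt" image vector with a "denim" text vector yields a query whose nearest neighbors share aspects of both inputs.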

Cited by 53 publications (16 citation statements)
References 40 publications (52 reference statements)
“…However, our convolution network is novel in learning not only content features but also each user's preferences via the end-to-end framework with the triplet loss. Compared with recent methods for Web content retrieval [41,69] that use content features extracted from the pre-trained convolution network, this novelty is unique. Fig.…”
Section: Feature Extraction Methods for Real-World Application
Mentioning confidence: 99%
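The triplet loss mentioned in this excerpt can be sketched in a few lines. This is a generic squared-Euclidean variant with an assumed margin, not necessarily the exact formulation used in the citing work:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: encourage the positive example to lie
    closer to the anchor than the negative by at least `margin`.
    The margin value and distance metric are assumptions."""
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance to positive
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance to negative
    return max(d_pos - d_neg + margin, 0.0)
```

In an end-to-end setup, this loss would be minimized over the network's embedding outputs so that items a given user prefers cluster near that user's anchor representation.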
“…Image classification models often employ CNNs to extract features related to shapes and textures in an image and to generate predictions of relevant attributes [17]. Similar ideas have been extended to clothing recommendation [18] and fashion image retrieval [19], [20]. To improve the performance of classification, several studies introduced multi-task learning (MTL) into their methods [21], [22].…”
Section: A. Fashion Attribute Recognition
Mentioning confidence: 99%
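The multi-task learning (MTL) setup referenced above, one shared feature extractor feeding several attribute heads, can be sketched minimally. The linear layers, dimensions, and head names are hypothetical stand-ins for a CNN backbone:

```python
import numpy as np

rng = np.random.default_rng(0)

def shared_features(x, w_shared):
    """Shared backbone: one linear layer + ReLU, standing in for a CNN."""
    return np.maximum(x @ w_shared, 0.0)

def multi_task_forward(x, w_shared, w_category, w_texture):
    """Two task-specific heads on top of shared features, as in MTL."""
    h = shared_features(x, w_shared)
    return h @ w_category, h @ w_texture

x = rng.normal(size=(1, 8))            # one 8-dim input feature vector
w_shared = rng.normal(size=(8, 4))     # shared backbone weights
w_cat = rng.normal(size=(4, 5))        # head 1: e.g. 5 clothing categories
w_tex = rng.normal(size=(4, 3))        # head 2: e.g. 3 texture attributes
cat_logits, tex_logits = multi_task_forward(x, w_shared, w_cat, w_tex)
```

Sharing the backbone lets gradients from both attribute tasks shape one feature space, which is the usual motivation for MTL in attribute recognition.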
“…Designers and users put forward their requirements through images and text, search for related product images from databases or e-commerce websites, and the matched images will be recommended to designers and users as design references. The retrieval input can be text, images, or both of them [162,163,164,165,166]. For product, the input image provided by designers and users may be taken by their phone on the street or in a store, which is quite different from image databases and e-commerce websites in terms of shooting angle, condition, background, or posture [167,168,169,170,171].…”
Section: Product Design Based on Image Data
Mentioning confidence: 99%