2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2017.444

Video2Shop: Exact Matching Clothes in Videos to Online Shopping Images

Abstract: In recent years, both online retail and video hosting services have been growing exponentially. In this paper, we explore a new cross-domain task, Video2Shop, which targets matching clothes appearing in videos to the exact same items in online shops. A novel deep neural network, called AsymNet, is proposed to explore this problem. For the image side, well-established methods are used to detect and extract features for clothing patches with arbitrary sizes. For the video side, deep visual features are extracted from det…
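As a rough illustration of the cross-domain retrieval step the abstract describes (clothing features from a video clip matched against features of online-shop images), the sketch below ranks shop items by cosine similarity to a mean-pooled video query. It is not the paper's AsymNet: the function names, feature dimensions, and the simple mean-pooling fusion are illustrative assumptions only.

```python
import torch
import torch.nn.functional as F

def rank_shop_items(video_clip_features: torch.Tensor,
                    shop_item_features: torch.Tensor) -> torch.Tensor:
    """Return shop-item indices sorted by similarity to the video query.

    video_clip_features: (T, D) features of T detected clothing regions
                         across the video clip (assumed precomputed).
    shop_item_features:  (M, D) features of M online-shop product images.
    """
    # Aggregate the clip into a single query vector. Plain mean pooling is an
    # assumption here; AsymNet's learned fusion is not reproduced.
    query = F.normalize(video_clip_features.mean(dim=0, keepdim=True), dim=1)
    gallery = F.normalize(shop_item_features, dim=1)
    scores = (query @ gallery.t()).squeeze(0)   # cosine similarity per shop item
    return scores.argsort(descending=True)      # best exact-match candidates first

# Toy usage with random 256-dimensional features.
ranking = rank_shop_items(torch.randn(8, 256), torch.randn(100, 256))
print(ranking[:5])  # indices of the five most similar shop images
```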


Citations: cited by 72 publications (48 citation statements).
References: 30 publications (60 reference statements).
“…Street2Shop [25], DARN [22], DeepFashion […] [22,25,34,24,44,9,8,58]. These methods usually follow a global similarity computation and matching pipeline, i.e.…”
Section: Datasets (mentioning)
confidence: 99%
“…There also exist some variants, such as dialog-based clothes search [17], video-based clothes retrieval [8], and attribute-feedback-based clothes retrieval [18,59]. Their application scenarios and settings are different from ours.…”
Section: Datasets (mentioning)
confidence: 99%
“…[43,26]. Another line of work considers retrieving fashion images based on various forms of queries, including images [26,35,52], attributes [8,1], occasions [24], videos [6], and user preferences [16]. Our work is closer to the 'cross-scenario' fashion retrieval setting (called street2shop) which seeks to retrieve fashion products appearing in street photos [25,17], as the same type of data can be adapted to our setting.…”
Section: Related Work (mentioning)
confidence: 99%
“…We use convolutional layers with one output channel to reduce the feature dimension. Since training images have different sizes, and inspired by previous work [14], one spatial pyramid pooling (SPP) layer is applied to reshape the features from the last convolutional layer into a fixed dimension. Finally, two fully connected layers are employed as a classifier.…”
Section: Network Architectures (mentioning)
confidence: 99%
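A minimal sketch of the design this citation statement describes: a one-output-channel convolution to reduce the feature dimension, a spatial pyramid pooling (SPP) layer to turn variable-size feature maps into a fixed-length vector, and two fully connected layers as the classifier. PyTorch, the pyramid levels, and the layer sizes are assumptions, not the citing paper's exact network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPPClassifier(nn.Module):
    def __init__(self, in_channels=512, levels=(1, 2, 4), hidden=256, num_classes=10):
        super().__init__()
        # "convolutional layers with one output channel to reduce the feature dimension"
        self.reduce = nn.Conv2d(in_channels, 1, kernel_size=1)
        self.levels = levels
        spp_dim = sum(l * l for l in levels)  # fixed SPP output length (single channel)
        # "two fully connected layers are employed as a classifier"
        self.fc1 = nn.Linear(spp_dim, hidden)
        self.fc2 = nn.Linear(hidden, num_classes)

    def spp(self, x):
        # Pool the feature map at several grid sizes and concatenate, so inputs
        # of different spatial size all yield a vector of the same length.
        pooled = [F.adaptive_max_pool2d(x, l).flatten(1) for l in self.levels]
        return torch.cat(pooled, dim=1)

    def forward(self, feat_map):
        x = self.reduce(feat_map)   # (N, 1, H, W); H and W may vary per batch
        x = self.spp(x)             # (N, sum(l*l))
        x = F.relu(self.fc1(x))
        return self.fc2(x)          # class logits

# Feature maps of different spatial sizes map to the same output shape.
net = SPPClassifier()
print(net(torch.randn(2, 512, 13, 13)).shape)  # torch.Size([2, 10])
print(net(torch.randn(2, 512, 7, 19)).shape)   # torch.Size([2, 10])
```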