Mariya I. Vasileva scite author profile

Outfits in online fashion data are composed of items of many different types (e.g. top, bottom, shoes) that share some stylistic relationship with one another. A representation for building outfits requires a method that can learn both notions of similarity (for example, when two tops are interchangeable) and compatibility (items of possibly different type that can go together in an outfit). This paper presents an approach to learning an image embedding that respects item type, and jointly learns notions of item similarity and compatibility in an end-to-end model. To evaluate the learned representation, we crawled 68,306 outfits created by users on the Polyvore website. Our approach obtains 3-5% improvement over the state-of-the-art on outfit compatibility prediction and fill-in-the-blank tasks using our dataset, as well as an established smaller dataset, while supporting a variety of useful queries.

Learning Similarity Conditions Without Explicit Supervision

Tan

Vasileva

Saenko

et al. 2019

117

Many real-world tasks require models to compare images along multiple similarity conditions (e.g. similarity in color, category or shape). Existing methods often reason about these complex similarity relationships by learning condition-aware embeddings. While such embeddings aid models in learning different notions of similarity, they also limit their capability to generalize to unseen categories since they require explicit labels at test time. To address this deficiency, we propose an approach that jointly learns representations for the different similarity conditions and their contributions as a latent variable without explicit supervision. Comprehensive experiments across three datasets, Polyvore-Outfits, Maryland-Polyvore and UT-Zappos50k, demonstrate the effectiveness of our approach: our model outperforms the state-of-the-art methods, even those that are strongly supervised with pre-defined similarity conditions, on fill-in-the-blank, outfit compatibility prediction and triplet prediction tasks. Finally, we show that our model learns different visually-relevant semantic sub-spaces that allow it to generalize well to unseen categories.

Learning Type-Aware Embeddings for Fashion Compatibility

Vasileva

Plummer

Dusad

et al. 2018

Preprint

OutfitTransformer: Outfit Representations for Fashion Recommendation

Sarkar

Bodla

Vasileva

et al. 2022

OutfitTransformer: Learning Outfit Representations for Fashion Recommendation

Sarkar

Bodla

Vasileva

et al. 2023

Why Do These Match? Explaining the Behavior of Image Similarity Models

Plummer

Vasileva

Petsiuk

et al. 2020

Why do These Match? Explaining the Behavior of Image Similarity Models

Plummer

Vasileva

Petsiuk

et al. 2019

Preprint

Explaining a deep learning model can help users understand its behavior and allow researchers to discern its shortcomings. Recent work has primarily focused on explaining models for tasks like image classification or visual question answering. In this paper, we introduce an explanation approach for image similarity models, where a model's output is a semantic feature representation rather than a classification. In this task, an explanation depends on both of the input images, so standard methods do not apply. We propose an explanation method that pairs a saliency map identifying important image regions with an attribute that best explains the match. We find that our explanations are more human-interpretable than saliency maps alone, and can also improve performance on the classic task of attribute recognition. The ability of our approach to generalize is demonstrated on two datasets from very different domains, Polyvore Outfits and Animals with Attributes 2.

HandsOff: Labeled Dataset Generation With No Additional Human Annotations

Xu

Vasileva

Dave

et al. 2022

Preprint
