Proceedings of the 27th ACM International Conference on Multimedia 2019
DOI: 10.1145/3343031.3350889
Who, Where, and What to Wear?

Abstract: Fashion knowledge helps people dress properly and addresses not only users' physiological needs but also the demands of social activities and conventions. It usually involves three mutually related aspects: occasion, person, and clothing. However, few works focus on extracting such knowledge, which would greatly benefit many downstream applications such as fashion recommendation. In this paper, we propose a novel method to automatically harvest fashion knowledge from social media. We unify…

Cited by 25 publications (3 citation statements)
References 34 publications (53 reference statements)
“…Then, we explore how to leverage the retrieved comments in multimodal classification and exploit a self-training framework to identify the comments' hints that shape cross-modal understanding (henceforth comment-aware self-training). This considers the method's feasibility in scenarios where large-scale labeled data is unavailable, which is common in realistic practice because annotating multimodal data from social media is extremely expensive (Ma et al., 2019). Concretely, we adopt a teacher-student prototype (Meng et al., 2020; Shen et al., 2021) and tailor it to learn multimodal understanding with the help of user comments.…”
Section: Text
confidence: 99%
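The comment-aware self-training loop described in this excerpt can be illustrated with a minimal sketch. The snippet below assumes hypothetical `teacher` and `student` PyTorch modules that consume image, text, and comment inputs, and an illustrative confidence threshold; it is a sketch of the general teacher-student pseudo-labeling idea, not the cited authors' implementation.

```python
# Minimal sketch of comment-aware teacher-student self-training.
# `teacher`/`student` are hypothetical nn.Modules taking (image, text, comments);
# tensor shapes and the threshold `tau` are illustrative assumptions.
import torch
import torch.nn.functional as F

def self_train_step(teacher, student, labeled_batch, unlabeled_batch,
                    optimizer, tau=0.9):
    # 1) Supervised loss on the small labeled set.
    img, txt, com, y = labeled_batch
    loss = F.cross_entropy(student(img, txt, com), y)

    # 2) Teacher pseudo-labels the retrieved (unlabeled) posts.
    u_img, u_txt, u_com = unlabeled_batch
    with torch.no_grad():
        probs = F.softmax(teacher(u_img, u_txt, u_com), dim=-1)
        conf, pseudo = probs.max(dim=-1)
        keep = conf > tau  # keep only confident teacher predictions

    # 3) Student learns from confident pseudo-labels, so comment cues
    #    shape cross-modal understanding without extra annotation.
    if keep.any():
        u_logits = student(u_img[keep], u_txt[keep], u_com[keep])
        loss = loss + F.cross_entropy(u_logits, pseudo[keep])

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this pattern the teacher's weights are typically frozen or updated slowly (e.g., as an exponential moving average of the student), which keeps pseudo-labels stable across steps.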
“…The labeled dataset $L$ is usually limited in scale (Ma et al., 2019), posing an over-fitting concern. Meanwhile, the retrieved posts, similar to the data in $L$, can form an unlabeled set $U = \{x'_i, c_i\}_{i=1}^{Kl}$ to enrich the training data.…”
Section: Self-Training With Retrieved Posts
confidence: 99%
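A minimal sketch of how such an unlabeled set might be assembled: for each of the $l$ labeled posts, retrieve $K$ similar posts together with their comments, yielding $Kl$ unlabeled pairs. The `retrieve_similar` helper below is hypothetical, not the paper's actual retrieval pipeline.

```python
# Illustrative construction of U = {(x'_i, c_i)} with |U| = K * l.
# `retrieve_similar` is an assumed helper returning the top-K posts
# (each with `.content` and `.comments`) similar to a labeled post.
def build_unlabeled_set(labeled_posts, retrieve_similar, K=5):
    unlabeled = []
    for post in labeled_posts:                 # l labeled posts
        for similar in retrieve_similar(post, top_k=K):  # K retrieved each
            unlabeled.append((similar.content, similar.comments))
    return unlabeled                           # K * l unlabeled pairs
```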
“…According to different studies, e-commerce retailers such as Amazon, eBay, and Shopstyle, and social networking sites such as Pinterest, Snapchat, Instagram, Facebook, Chictopia, and Lookbook, are now regarded as the most popular media for fashion advice and recommendations [15-22]. Research on textual content, such as posts and comments [23], emotion and information diffusion [24], and images has attracted the attention of modern-day researchers, as it can help to predict fashion trends and facilitate the development of effective recommendation systems [5, 25-27]. An effective recommendation system is a crucial tool for successfully conducting an e-commerce business.…”
Section: Introduction
confidence: 99%