2018 · DOI: 10.1609/aaai.v32i1.11331

Towards Building Large Scale Multimodal Domain-Aware Conversation Systems

Abstract: While multimodal conversation agents are gaining importance in several domains such as retail and travel, deep learning research in this area has been limited, primarily due to the lack of large-scale, open chat logs. To overcome this bottleneck, in this paper we introduce the task of multimodal, domain-aware conversations and propose the MMD benchmark dataset. This dataset was gathered by working in close coordination with a large number of domain experts in the retail domain. These experts sug…

Cited by 71 publications (57 citation statements)
References 8 publications
“…Multimodal Dialogue. Multimodal dialogue has been actively studied in previous works (Das et al. 2017; De Vries et al. 2017; Mostafazadeh et al. 2017; Saha, Khapra, and Sankaranarayanan 2017; Pasunuru and Bansal 2018; Alamri et al. 2019; Haber et al. 2019; Kim et al. 2019; Moon et al. 2020; Shuster et al. 2020; Cheng et al. 2020). Although all these works involve interesting task setups (question answering/generation, object discovery, shopping, collaborative drawing, response retrieval/generation, image identification/generation, etc.)…”
Section: Related Work
Confidence: 99%
“…Hence, we follow a similar line of work and propose to enrich fashion dialog agents with politeness. Saha et al. (2018) introduced MMD, a large-scale multimodal fashion dialog dataset built semi-automatically with field experts, accompanied by two RNN (Cho et al., 2014) models capable of emulating the system responses in a multimodal scenario. Owing to its domain, it carries mainly neutral and polite dialog.…”
Section: Related Work
Confidence: 99%
“…Polite-RL uses the politeness score of a sampled utterance as the reinforcement learning component of the loss function (see appendix A.4), guiding the generation towards a more polite output. We focused on improving the embeddings used to include a novel lexicon, given that the fashion domain (Saha et al., 2018) differs significantly from the training data (Danescu-Niculescu-Mizil et al., 2013), making out-of-vocabulary situations a major issue. Originally, this model uses embeddings initialized with a Word2Vec model trained on the Google News dataset (Mikolov et al., 2013).…”
Section: Politeness Through Utterance Generation
Confidence: 99%
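The mixed objective this excerpt describes (an MLE loss plus a reward-weighted, REINFORCE-style term driven by a politeness score) can be sketched as follows. This is an illustrative sketch only, not the cited paper's implementation: `politeness_score` is a toy stand-in for a learned politeness classifier, and `mixed_loss`, `lam`, and the numeric values are hypothetical.

```python
import math

# Hypothetical sketch of a Polite-RL-style mixed loss: a standard
# cross-entropy (MLE) term is combined with a REINFORCE term that weights
# the log-probability of a sampled utterance by its politeness reward.
# `politeness_score` is a toy stand-in for a learned politeness classifier.

def politeness_score(tokens):
    # Toy scorer: reward 1.0 if the utterance contains a polite marker.
    polite_markers = {"please", "thanks", "thank"}
    return 1.0 if polite_markers & set(tokens) else 0.0

def mixed_loss(mle_loss, sampled_tokens, sampled_log_prob, lam=0.5):
    # Total loss = MLE loss - lambda * reward * log p(sampled utterance).
    # Minimizing the second term maximizes reward-weighted likelihood,
    # nudging generation toward polite outputs while the MLE term keeps
    # the model on-distribution.
    reward = politeness_score(sampled_tokens)
    rl_term = -lam * reward * sampled_log_prob
    return mle_loss + rl_term

loss = mixed_loss(
    mle_loss=2.3,                       # illustrative cross-entropy value
    sampled_tokens=["thanks", "for", "your", "order"],
    sampled_log_prob=math.log(0.05),    # log-prob of the sampled utterance
)
print(round(loss, 3))  # → 3.798
```

Note that when the sampled utterance earns zero reward, the RL term vanishes and the loss reduces to plain MLE, so only polite samples influence the gradient through the reward-weighted term.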