Siva Sankalp Patel scite author profile

Disentangling conversations mixed together in a single stream of messages is a difficult task, made harder by the lack of large manually annotated datasets. We created a new dataset of 77,563 messages manually annotated with reply-structure graphs that both disentangle conversations and define internal conversation structure. Our dataset is 16 times larger than all previously released datasets combined, the first to include adjudication of annotation disagreements, and the first to include context. We use our data to re-examine prior work, in particular, finding that 80% of conversations in a widely used dialogue corpus are either missing messages or contain extra messages. Our manually-annotated data presents an opportunity to develop robust data-driven methods for conversation disentanglement, which will help advance dialogue research.

doc2dial: A Goal-Oriented Document-Grounded Dialogue Dataset

Feng, Wan, Gunasekara et al. 2020

We introduce doc2dial, a new dataset of goal-oriented dialogues that are grounded in the associated documents. Inspired by how the authors compose documents for guiding end users, we first construct dialogue flows based on the content elements that correspond to higher-level relations across text sections as well as lower-level relations between discourse units within a section. Then we present these dialogue flows to crowd contributors to create conversational utterances. The dataset includes over 4500 annotated conversations with an average of 14 turns that are grounded in over 450 documents from four domains. Compared to the prior document-grounded dialogue datasets, this dataset covers a variety of dialogue scenes in information-seeking conversations. For evaluating the versatility of the dataset, we introduce multiple dialogue modeling tasks and present baseline approaches.

Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks

Kapanipathi, Thost, Patel et al. 2020, AAAI

Textual entailment is a fundamental task in natural language processing. Most approaches for solving this problem use only the textual content present in training data. A few approaches have shown that information from external knowledge sources like knowledge graphs (KGs) can add value, in addition to the textual content, by providing background knowledge that may be critical for a task. However, the proposed models do not fully exploit the information in the usually large and noisy KGs, and it is not clear how it can be effectively encoded to be useful for entailment. We present an approach that complements text-based entailment models with information from KGs by (1) using Personalized PageRank to generate contextual subgraphs with reduced noise and (2) encoding these subgraphs using graph convolutional networks to capture the structural and semantic information in KGs. We evaluate our approach on multiple textual entailment datasets and show that the use of external knowledge helps the model to be robust and improves prediction accuracy. This is particularly evident in the challenging BreakingNLI dataset, where we see an absolute improvement of 5-20% over multiple text-based entailment models.

MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents

Feng, Patel, Wan et al. 2021

We propose MultiDoc2Dial, a new task and dataset on modeling goal-oriented dialogues grounded in multiple documents. Most previous works treat document-grounded dialogue modeling as a machine reading comprehension task based on a single given document or passage. In this work, we aim to address more realistic scenarios where a goal-oriented information-seeking conversation involves multiple topics, and hence is grounded on different documents. To facilitate such a task, we introduce a new dataset that contains dialogues grounded in multiple documents from four different domains. We also explore modeling the dialogue-based and document-based context in the dataset. We present strong baseline approaches and various experimental results, aiming to support further research efforts on such a task.

Agent Assist through Conversation Analysis

Fadnis, Mills, Ganhotra et al. 2020

Customer support agents play a crucial role as an interface between an organization and its end-users. We propose CAIRAA: Conversational Approach to Information Retrieval for Agent Assistance, to reduce the cognitive workload of support agents who engage with users through conversation systems. CAIRAA monitors an evolving conversation and recommends both responses and URLs of documents the agent can use in replies to their client. We combine traditional information retrieval (IR) approaches with more recent Deep Learning (DL) models to ensure high accuracy and efficient run-time performance in the deployed system. Here, we describe the CAIRAA system and demonstrate its effectiveness in a pilot study via a short video.
