2021
DOI: 10.1145/3451160

How Am I Doing?: Evaluating Conversational Search Systems Offline

Abstract: As conversational agents like Siri and Alexa gain in popularity and use, conversation is becoming a more and more important mode of interaction for search. Conversational search shares some features with traditional search, but differs in some important respects: conversational search systems are less likely to return ranked lists of results (a SERP), more likely to involve iterated interactions, and more likely to feature longer, well-formed user queries in the form of natural language questions. Because of t…

Cited by 36 publications (23 citation statements). References 32 publications (37 reference statements).
“…User simulation has been widely leveraged in the past for training the dialogue state tracking component of conversational agents using reinforcement learning algorithms, either via agenda-based or model-based simulation [19]. The highly interactive nature of conversational information access systems has also sparked renewed interest in evaluation using user simulation within the IR community [4,5,23,36,38,53]. Recently, Zhang and Balog [53] proposed a general framework for evaluating conversational recommender systems using user simulation.…”
Section: Discussion (mentioning)
confidence: 99%
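To make simulation-based evaluation concrete, below is a minimal, hypothetical sketch of the general idea: an agenda-based simulated user interacts with a (mocked) conversational system, and the evaluator counts how many turns the system needs to satisfy the user's information needs. All names here (`SimulatedUser`, `mock_system`, `evaluate`) are illustrative assumptions, not the framework of Zhang and Balog or any cited system.

```python
class SimulatedUser:
    """Toy agenda-based user simulator: the user holds an ordered list
    of information needs (an agenda) and asks about them one by one."""

    def __init__(self, agenda):
        self.agenda = list(agenda)

    def next_utterance(self):
        # The next unsatisfied need, or None once the agenda is empty.
        return self.agenda[0] if self.agenda else None

    def observe(self, response, need):
        # Mark the current need satisfied if the response covers it.
        if need in response:
            self.agenda.pop(0)


def mock_system(utterance):
    """Stand-in conversational system that always answers on-topic."""
    return f"Here is information about {utterance}."


def evaluate(system, user, max_turns=10):
    """Run the simulated dialogue; return turns taken to empty the agenda."""
    for turn in range(1, max_turns + 1):
        need = user.next_utterance()
        if need is None:
            return turn - 1  # all needs satisfied
        user.observe(system(need), need)
    return max_turns  # budget exhausted


turns = evaluate(mock_system, SimulatedUser(["flights", "hotels", "weather"]))
print(turns)  # one successful turn per information need -> prints 3
```

A real simulator would replace `mock_system` with the system under test and use a learned or agenda-based user model; the point of the sketch is only the evaluation loop: no human in the loop, and the metric falls out of the simulated interaction.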
“…It is important to note that it is also possible to succeed while leaving users frustrated, as studied by Feild et al (2010). A particular end-to-end evaluation approach was recently presented by Lipani et al (2021), based on the flow of different subtopics within a conversation.…”
Section: Metrics for End-to-End Evaluation (mentioning)
confidence: 99%
“…For this reason, researchers adopt human-in-the-loop techniques to mimic human-computer interactions, and further perform human annotation to evaluate the whole system's performance (in response to human). Recent work of Lipani et al [30] propose a metric for offline evaluation of conversational search systems based on user interaction model.…”
Section: Related Work (mentioning)
confidence: 99%
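As a toy illustration of how a user interaction model can underpin an offline metric: weight the gain obtained at each conversational turn by the probability, under the user model, that the user reaches that turn. The geometric continuation model and the numbers below are assumptions for illustration only, not the metric proposed by Lipani et al.

```python
def session_metric(turn_gains, continuation_prob=0.8):
    """Expected gain under a simple user model: after each turn the
    simulated user continues with probability `continuation_prob`,
    so turn t is weighted by continuation_prob ** (t - 1)."""
    expected = 0.0
    reach = 1.0  # probability the user reaches turn 1
    for gain in turn_gains:
        expected += reach * gain
        reach *= continuation_prob
    return expected


# A conversation whose per-turn relevance improves over three turns:
print(round(session_metric([0.2, 0.5, 1.0]), 3))
```

Because later turns are discounted by the chance the user has already left, a system that satisfies the need early scores higher than one that takes many turns to reach the same gains, which is the intuition behind interaction-model-based offline evaluation.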