Amanda Cercas Curry scite author profile

The majority of NLG evaluation relies on automatic metrics, such as BLEU. In this paper, we motivate the need for novel, system-and data-independent automatic evaluation methods: We investigate a wide range of metrics, including state-of-the-art word-based and novel grammar-based ones, and demonstrate that they only weakly reflect human judgements of system outputs as generated by data-driven, end-to-end NLG. We also show that metric performance is data-and system-specific. Nevertheless, our results also suggest that automatic metrics perform reliably at system-level and can support system development by finding cases where a system performs poorly.

show abstract

Alexa, Google, Siri: What are Your Pronouns? Gender and Anthropomorphism in the Design and Perception of Conversational Assistants

Abercrombie¹,

Curry²,

Pandya³

et al. 2021

View full text Add to dashboard Cite

Technology companies have produced varied responses to concerns about the effects of the design of their conversational AI systems. Some have claimed that their voice assistants are in fact not gendered or human-likedespite design features suggesting the contrary. We compare these claims to user perceptions by analysing the pronouns they use when referring to AI assistants. We also examine systems' responses and the extent to which they generate output which is gendered and anthropomorphic. We find that, while some companies appear to be addressing the ethical concerns raised, in some cases, their claims do not seem to hold true. In particular, our results show that system outputs are ambiguous as to the humanness of the systems, and that users tend to personify and gender them as a result.

show abstract

#MeToo Alexa: How Conversational Systems Respond to Sexual Harassment

Curry¹,

Rieser²

2018

View full text Add to dashboard Cite

Conversational AI systems are rapidly developing from purely transactional systems to social chatbots, which can respond to a wide variety of user requests. In this article, we establish how current state-of-the-art conversational systems react to inappropriate requests, such as bullying and sexual harassment on the part of the user, by collecting and analysing the novel #MeToo corpus. Our results show that commercial systems mainly avoid answering, while rule-based chatbots show a variety of behaviours and often deflect. Data-driven systems, on the other hand, are often noncoherent, but also run the risk of being interpreted as flirtatious and sometimes react with counter-aggression. This includes our own system, trained on "clean" data, which suggests that inappropriate system behaviour is not caused by data bias. 1 goo.gl/qRSvxv 2 Disclaimer: This paper contains examples which some readers may find disturbing.

show abstract

A review of evaluation techniques for social dialogue systems

Curry

Hastie

Rieser

2017

View full text Add to dashboard Cite

In contrast with goal-oriented dialogue, social dialogue has no clear measure of task success. Consequently, evaluation of these systems is notoriously hard. In this paper, we review current evaluation methods, focusing on automatic metrics. We conclude that turnbased metrics often ignore the context and do not account for the fact that several replies are valid, while end-of-dialogue rewards are mainly hand-crafted. Both lack grounding in human perceptions.

show abstract

Alexa, Google, Siri: What are Your Pronouns? Gender and Anthropomorphism in the Design and Perception of Conversational Assistants

Abercrombie¹,

Curry²,

Pandya³

et al. 2021

Preprint

View full text Add to dashboard Cite

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.