Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations, 2021
DOI: 10.18653/v1/2021.acl-demo.34
ExplainaBoard: An Explainable Leaderboard for NLP

Abstract: With the rapid development of NLP research, leaderboards have emerged as one tool to track the performance of various systems on various NLP tasks. They are effective in this goal to some extent, but generally present a rather simplistic one-dimensional view of the submitted systems, communicated only through holistic accuracy numbers. In this paper, we present a new conceptualization and implementation of NLP evaluation: the EXPLAINABOARD, which in addition to inheriting the functionality of the standard leaderboard…
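The core idea behind moving past a single holistic accuracy number is fine-grained, attribute-bucketed analysis: break the test set into buckets along an attribute (e.g. input length) and report performance per bucket. The sketch below illustrates that idea only; the function names, attribute choice, and bucket edges are hypothetical and are not ExplainaBoard's actual API.

```python
# Illustrative sketch of attribute-bucketed evaluation, the kind of
# fine-grained breakdown an explainable leaderboard reports instead of a
# single holistic accuracy. Names and bucket edges are hypothetical.
from collections import defaultdict

def bucket_accuracy(examples, predict, attribute, edges):
    """Group examples by an attribute value and report accuracy per bucket.

    examples  -- iterable of (input_text, gold_label) pairs
    predict   -- callable mapping input_text -> predicted label
    attribute -- callable mapping input_text -> numeric attribute (e.g. length)
    edges     -- sorted bucket upper boundaries, e.g. [10, 20, 40]
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for text, gold in examples:
        value = attribute(text)
        # Place the example in the first bucket whose upper edge exceeds it.
        bucket = next((e for e in edges if value < e), float("inf"))
        totals[bucket] += 1
        hits[bucket] += int(predict(text) == gold)
    return {b: hits[b] / totals[b] for b in sorted(totals)}

if __name__ == "__main__":
    # Example question: does accuracy degrade on longer inputs?
    data = [("short sentence", "A"), ("a somewhat longer sentence here", "B")]
    model = lambda text: "A"                     # stand-in for a real system
    length = lambda text: len(text.split())      # bucketing attribute
    print(bucket_accuracy(data, model, length, edges=[4, 8, 16]))
```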

Cited by 19 publications (17 citation statements) · References 31 publications
“…We strongly advocate for better methods to assess the capability of models for numerical reasoning. One such direction could be akin to Linzen (2020) who proposes a parallel evaluation paradigm that rewards models for possessing human-like generalization capabilities and Liu et al (2021) that augments current leaderboards with three extra dimensions of interpretability, interactivity, and reliability. We highly recommend for careful design of the benchmarks and better leaderboards to correctly measure progress in such complex tasks.…”
Section: Discussion (mentioning; confidence: 99%)
“…TextBox (Li et al 2021), an open-source library for text generation, provides a comprehensive and efficient framework for reproducing and developing text generation algorithms. Liu et al (2021) has released ExplainaBoard, which is a unified platform to evaluate interpretable, interactive and reliable capabilities of NLP systems. Photon (Zeng et al 2020) and DIALOGPT (Zhang et al 2020d) are two comprehensive systems for cross-domain text-to-SQL and conversational response generation tasks, respectively.…”
Section: Related Work (mentioning; confidence: 99%)
“…KYD (Google, 2021) also provides a web platform for data analysis but it mainly focuses on image data. ExplainaBoard (Liu et al, 2021a) presents an analysis platform while it focuses on system diagnostics.…”
Section: Related Work (mentioning; confidence: 99%)