Geoffrey Scoutheeten scite author profile

Geoffrey Scoutheeten

5 Publications

82 Citation Statements Received

310 Citation Statements Given

How they've been cited

How they cite others

158

308

Affiliations

BNP Paribas (France)

Publications


A Hierarchical Model for Data-to-Text Generation

Rebuffel, Soulier, Scoutheeten et al. 2020


Transcribing structured data into natural language descriptions has emerged as a challenging task, referred to as "data-to-text". These structures generally group multiple elements, as well as their attributes. Most attempts rely on translation encoder-decoder methods which linearize elements into a sequence. This, however, loses most of the structure contained in the data. In this work, we propose to overcome this limitation with a hierarchical model that encodes the data structure at the element level and the structure level. Evaluations on RotoWire show the effectiveness of our model w.r.t. qualitative and quantitative metrics.


Data-QuestEval: A Referenceless Metric for Data-to-Text Semantic Evaluation

Rebuffel, Scialom, Soulier et al. 2021


QUESTEVAL is a reference-less metric used in text-to-text tasks that compares the generated summaries directly to the source text by automatically asking and answering questions. Its adaptation to Data-to-Text tasks is not straightforward, as it requires multimodal Question Generation and Answering systems on the considered tasks, which are seldom available. To this end, we propose a method to build synthetic multimodal corpora that make it possible to train multimodal components for a data-QuestEval metric. The resulting metric is reference-less and multimodal; it obtains state-of-the-art correlations with human judgment on the WebNLG and WikiBio benchmarks. We make data-QUESTEVAL's code and models available for reproducibility purposes, as part of the QUESTEVAL project.


Let’s Stop Incorrect Comparisons in End-to-end Relation Extraction!

Taillé, Guigue, Scoutheeten et al. 2020


Despite efforts to distinguish three different evaluation setups (Bekoulis et al., 2018a,b), numerous end-to-end Relation Extraction (RE) articles present unreliable performance comparisons to previous work. In this paper, we first identify several patterns of invalid comparisons in published papers and describe them to avoid their propagation. We then propose a small empirical study to quantify the impact of the most common mistake and estimate that it leads to overestimating the final RE performance by around 5% on ACE05. We also seize this opportunity to study the unexplored ablations of two recent developments: the use of language model pretraining (specifically BERT) and span-level NER. This meta-analysis emphasizes the need for rigor in the report of both the evaluation setting and the dataset statistics. We finally call for unifying the evaluation setting in end-to-end RE.


Controlling Hallucinations at Word Level in Data-to-Text Generation

Rebuffel, Roberti, Soulier et al. 2021

Preprint



Data-QuestEval: A Referenceless Metric for Data-to-Text Semantic Evaluation

Rebuffel, Scialom, Soulier et al. 2021

Preprint


In this paper, we explore how QUESTEVAL, which is a Text-vs-Text metric, can be adapted for the evaluation of Data-to-Text Generation systems. QUESTEVAL is a reference-less metric that compares the predictions directly to the structured input data by automatically asking and answering questions. Its adaptation to Data-to-Text is not straightforward, as it requires multi-modal Question Generation and Answering (QG & QA) systems. To this end, we propose to build synthetic multi-modal corpora that make it possible to train multi-modal QG/QA systems. The resulting metric is reference-less and multi-modal; it obtains state-of-the-art correlations with human judgement on the E2E and WebNLG benchmarks.


