Investigating the Role of Argumentation in the Rhetorical Analysis of Scientific Publications with Neural Multi-Task Learning Models

Lauscher, Anne; Glavaš, Goran; Ponzetto, Simone Paolo; Eckert, Kai

doi:10.18653/v1/d18-1370

Cited by 18 publications

(15 citation statements)

References 47 publications

“…Domain knowledge is necessary to annotate scientific publications, and therefore annotation on scientific publications is difficult for the non-expert (e.g., crowdsourcing workers). Indeed, existing data sets for various tasks in scientific publication mining (Lauscher et al, 2018; Hua et al, 2019; Yang et al, 2019; Yasunaga et al, 2019) are limited in terms of size, which additionally suggests that obtaining a sufficient number of data for supervised machine learning on scientific text is expensive and time-consuming. To remedy this bottleneck of annotation cost, we propose a self-supervised approach in which we use direct inline figure references in the article body to heuristically pair article paragraphs with figure captions and use those pairs as distant supervision.…”

Section: Related Work (mentioning)

confidence: 99%

Visual Summary Identification From Scientific Publications via Self-Supervised Learning

Yamamoto

Lauscher

Ponzetto

et al. 2021

Front. Res. Metr. Anal.

Self Cite

The exponential growth of scientific literature creates a need to help users analyze and understand the body of research work both effectively and efficiently. This exploratory process can be facilitated by providing graphical abstracts, that is, visual summaries of scientific publications. Accordingly, previous work recently presented an initial study on the automatic identification of a central figure in a scientific publication, to be used as the publication’s visual summary. This study, however, was limited to a single (biomedical) domain. This is primarily because the current state of the art relies on supervised machine learning, which typically requires large amounts of labeled data: the only existing annotated data set until now covered only biomedical publications. In this work, we build a novel benchmark data set for visual summary identification from scientific publications, which consists of papers presented at conferences from several areas of computer science. We couple this contribution with a new self-supervised learning approach to learn a heuristic matching of in-text references to figures with figure captions. Our self-supervised pre-training, executed on a large unlabeled collection of publications, reduces the need for large annotated data sets for visual summary identification and facilitates domain transfer for this task. We evaluate our self-supervised pre-training for visual summary identification on both the existing biomedical data set and our newly presented computer science data set. The experimental results suggest that the proposed method is able to outperform the previous state of the art without any task-specific annotations.

“…Green (2017b) extracted argumentative units from biomedical and biological articles using a semantic rule-based approach. Lauscher et al (2018a) and Lauscher et al (2018c) proposed several neural multi-task learning models based on Bi-LSTM to identify premises and conclusions. Other papers propose different approaches to identify argumentative zones, including supervised and weakly-supervised approaches with a rich set of linguistics features (e.g., (Guo et al, 2011)).…”

Section: Automatic Argument Unit Identification (mentioning)

confidence: 99%

Proceedings of the Second Workshop on Scholarly Document Processing

2021

Most work on scholarly document processing assumes that the information processed is trustworthy and factually correct. However, this is not always the case. Two core challenges should be addressed: 1) ensuring that scientific publications are credible, e.g., that claims are not made without supporting evidence and that all relevant supporting evidence is provided; and 2) ensuring that scientific findings are not misrepresented, distorted, or outright misreported when communicated by journalists or the general public. I will present some first steps towards addressing these problems and outline remaining challenges.

Biology: Wood Frogs (Rana sylvatica) are a charismatic species of frog common in much of North America. They breed in explosive choruses over a few nights in late winter to early spring. The incidence in Wood Frogs was associated with a die-off of frogs during the breeding chorus in the Sylamore District of the Ozark National Forest in Arkansas (Trauth et al., 2000).

Computer Science: Land use or cover change is a direct reflection of human activity, such as land use, urban expansion, and architectural planning, on the earth's surface caused by urbanization [1]. Remote sensing images are important data sources that can efficiently detect land changes. Meanwhile, remote sensing image-based change detection is the change identification of surficial objects or geographic phenomena through the remote observation of two or more different phases [2].

“…The usefulness of deep networks has been tested and proven in many NLP tasks, such as machine translation (Young et al, 2018), sentiment analysis (Zhang et al, 2018a), text classification (Conneau et al, 2017; Zhang et al, 2018b), relations extraction (Huang and Wang, 2017), as well as in AM (Cocarascu and Toni, 2017, 2018; Daxenberger et al, 2017; Galassi et al, 2018; Lauscher et al, 2018; Lugini and Litman, 2018; Schulz et al, 2018). While a straightforward approach to exploit domain knowledge in AM is to apply a set of hand-crafted rules on the output of some first stage classifier (such as a neural network), NeSy or SRL approaches can directly enforce (hard or soft) constraints during training, so that a solution that does not satisfy them is penalized, or even ruled out.…”

Section: Combining Symbolic and Sub-symbolic Approaches (mentioning)

confidence: 99%

Neural-Symbolic Argumentation Mining: An Argument in Favor of Deep Learning and Reasoning

Galassi

Kersting

Lippi

et al. 2020

Front. Big Data
