Pablo Accuosto scite author profile

Saggion

2019

In this work we propose to leverage resources available with discourse-level annotations to facilitate the identification of argumentative components and relations in scientific texts, which has been recognized as a particularly challenging task. In particular, we implement and evaluate a transfer learning approach in which contextualized representations learned from discourse parsing tasks are used as input to argument mining models. As a pilot application, we explore the feasibility of using automatically identified argumentative components and relations to predict the acceptance of papers in computer science venues. In order to conduct our experiments, we propose an annotation scheme for argumentative units and relations and use it to enrich an existing corpus with an argumentation layer.

Discourse-Driven Argument Mining in Scientific Abstracts

Saggion

2019

Argument mining consists in the automatic identification of argumentative structures in texts. In this work we address the open question of whether discourse-level annotations can facilitate the identification of argumentative components and relations in scientific literature. We conduct a pilot study by enriching a corpus of computational linguistics abstracts that contains discourse annotations with a new argumentative annotation level. The results obtained from preliminary experiments confirm the potential value of the proposed approach.

Mining arguments in scientific abstracts with discourse-level embeddings

Data & Knowledge Engineering

Saggion

2020

Cross-lingual semantic annotation of biomedical literature: experiments in Spanish and English

Pérez

Bravo

et al. 2019

Motivation: Biomedical literature is one of the most relevant sources of information for knowledge mining in the field of Bioinformatics. In spite of English being the most widely addressed language in the field, in recent years there has been a growing interest from the natural language processing community in dealing with languages other than English. However, the availability of language resources and tools for appropriate treatment of non-English texts is lagging behind. Our research is concerned with the semantic annotation of biomedical texts in the Spanish language, which can be considered an under-resourced language where biomedical text processing is concerned. Results: We have carried out experiments to assess the effectiveness of several methods for the automatic annotation of biomedical texts in Spanish. One approach is based on the linguistic analysis of Spanish texts and their annotation using an information retrieval and concept disambiguation approach. A second method takes advantage of a Spanish-English machine translation process to annotate English documents and transfer annotations back to Spanish. A third method combines both procedures. Our evaluation shows that a combined system has competitive advantages over the two individual procedures. Availability: UMLSmapper (https://snlt.vicomtech.org/umlsmapper) and the annotation transfer tool (http://scientmin.taln.upf.edu/anntransfer) are freely available for research purposes as web services and/or demos. Supplementary information: Supplementary data are available at Bioinformatics online.

Multi-level mining and visualization of scientific text collections

Ronzano

Ferrés

et al. 2017

We present a system to mine and visualize collections of scientific documents by semantically browsing information extracted from single publications or aggregated throughout corpora of articles. The text mining tool performs deep analysis of document collections, allowing the extraction and interpretation of research papers' contents. In addition to the extraction and enrichment of documents with metadata (titles, authors, affiliations, etc.), the deep analysis performed comprises semantic interpretation, rhetorical analysis of sentences, triple-based information extraction, and text summarization. The visualization components allow geographical-based exploration of collections, topic-evolution interpretation, and collaborative network analysis, among others. The paper presents a case study of a bilingual collection in the field of Natural Language Processing (NLP).