Relation extraction systems require large amounts of labeled examples, which are costly to annotate. In this work we reformulate relation extraction as an entailment task, with simple, hand-made verbalizations of relations produced in less than 15 minutes per relation. The system relies on a pretrained textual entailment engine which is run as-is (no training examples, zero-shot) or further fine-tuned on labeled examples (few-shot or fully trained). In our experiments on TACRED we attain 63% F1 zero-shot and 69% with 16 examples per relation (17 points better than the best supervised system under the same conditions), and come only 4 points short of the state of the art (which uses 20 times more training data). We also show that performance can be improved significantly with larger entailment models, by up to 12 points in the zero-shot setting, yielding the best results to date on TACRED when fully trained. The analysis shows that our few-shot systems are especially effective at discriminating between relations, and that the performance difference in low-data regimes comes mainly from identifying no-relation cases.
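For illustration, below is a minimal sketch of the entailment reformulation: an off-the-shelf NLI model scores hand-written relation verbalizations against the input sentence, and a probability threshold backs off to no-relation. The model name, templates, and threshold are illustrative assumptions, not the paper's exact configuration.

```python
# A minimal sketch of entailment-based relation extraction, assuming an
# off-the-shelf NLI model (roberta-large-mnli is only an illustrative choice).
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

def entailment_prob(premise: str, hypothesis: str) -> float:
    """Probability that the premise entails the hypothesis."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = logits.softmax(dim=-1)[0]
    # roberta-large-mnli label order: 0=contradiction, 1=neutral, 2=entailment
    return probs[2].item()

# Hand-made verbalization templates (hypothetical examples, not the paper's exact ones).
templates = {
    "per:city_of_birth": "{subj} was born in {obj}.",
    "org:founded_by":    "{subj} was founded by {obj}.",
}

def predict_relation(sentence, subj, obj, threshold=0.5):
    scores = {rel: entailment_prob(sentence, t.format(subj=subj, obj=obj))
              for rel, t in templates.items()}
    best_rel, best_score = max(scores.items(), key=lambda kv: kv[1])
    # Below the threshold, back off to no_relation, mirroring the zero-shot setup.
    return best_rel if best_score >= threshold else "no_relation"

print(predict_relation("Gabriel García Márquez was born in Aracataca.",
                       "Gabriel García Márquez", "Aracataca"))
```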
Named entity disambiguation is the task of linking entity mentions to their intended referent, as represented in a Knowledge Base, usually derived from Wikipedia. In this paper, we combine local mention context and global hyperlink structure from Wikipedia in a probabilistic framework. Our results show that the two models of context, namely words in the context and hyperlink pathways to other entities in the context, are complementary. We test our method on eight datasets, improving the state-of-the-art results on five without any tuning, showing that it is robust to out-of-domain scenarios. When tuning combination weights, we match the best reported results on the widely used AIDA-CoNLL test-b.
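As a rough illustration, the sketch below combines the two sources of evidence in log space: a mention prior, a local word model, and a global hyperlink-coherence model. The factorization, distributions, and weights are assumptions for illustration, not the paper's exact probabilistic model.

```python
import math

# Toy combination of local and global evidence for one candidate entity.
# All probability tables are assumed to be estimated from Wikipedia statistics.
def score_candidate(entity, mention, context_words, context_entities,
                    prior, p_word_given_entity, p_entity_given_entity,
                    w_local=1.0, w_global=1.0):
    # P(e | m): how often the mention string links to this entity in Wikipedia.
    log_score = math.log(prior.get((mention, entity), 1e-9))
    # Local evidence: words observed around the mention.
    log_score += w_local * sum(math.log(p_word_given_entity.get((w, entity), 1e-9))
                               for w in context_words)
    # Global evidence: hyperlink pathways to other entities in the document.
    log_score += w_global * sum(math.log(p_entity_given_entity.get((e, entity), 1e-9))
                                for e in context_entities)
    return log_score

def disambiguate(mention, candidates, **evidence):
    # Pick the candidate entity with the highest combined score.
    return max(candidates, key=lambda e: score_candidate(e, mention, **evidence))
```

The weights w_local and w_global stand in for the tunable combination weights mentioned in the abstract; with both set to 1 the combination is untuned.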
Named Entity Disambiguation (NED) algorithms disambiguate mentions of named entities with respect to a knowledge base, but sometimes the context may be poor or misleading. In this paper we introduce the acquisition of two kinds of background information to alleviate that problem: entity similarity and selectional preferences for syntactic positions. We show, using a generative Naïve Bayes model for NED, that the additional sources of context are complementary and improve results on the CoNLL 2003 and TAC KBP EDL 2014 datasets, yielding the third-best and the best results, respectively. We provide examples and analysis which show the value of the acquired background information.
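As a rough illustration only, a Naïve Bayes scorer could treat context words, acquired entity-similarity evidence, and the selectional preferences of the mention's syntactic position as conditionally independent features; the feature names and smoothing below are hypothetical.

```python
import math

# Toy Naïve Bayes scoring with the two extra evidence sources mentioned above.
def naive_bayes_score(entity, features, p_entity, p_feature_given_entity):
    """log P(e) + sum_f log P(f | e), assuming conditional independence of features."""
    score = math.log(p_entity.get(entity, 1e-9))
    for f in features:
        score += math.log(p_feature_given_entity.get((f, entity), 1e-9))  # smoothed
    return score

# Features for one mention: context words, plus acquired background evidence:
# entities similar to the candidate, and the mention's syntactic position
# (e.g. "dobj_of:beat"), whose selectional preferences were learned from corpora.
features = ["match", "goal", "similar:FC_Barcelona", "dobj_of:beat"]
```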
Cortical circuits rely on the temporal regularities of speech to optimize signal parsing for sound-to-meaning mapping. Bottom-up speech analysis is accelerated by top-down predictions about upcoming words. In everyday communication, however, listeners are regularly presented with challenging input, such as fluctuations of speech rate or semantic content. In this study, we asked how reducing the temporal regularity of speech affects its processing: parsing, phonological analysis, and the ability to generate context-based predictions. To ensure that the spoken sentences were natural and approximated the semantic constraints of spontaneous speech, we built a neural network to select stimuli from large corpora. We analyzed brain activity recorded with magnetoencephalography during sentence listening using evoked responses, speech-to-brain synchronization, and representational similarity analysis. For normal speech, theta-band (6.5–8 Hz) speech-to-brain synchronization was increased and the left fronto-temporal areas generated stronger contextual predictions. The reverse was true for temporally irregular speech: weaker theta synchronization and reduced top-down effects. Interestingly, delta-band (0.5 Hz) speech tracking was greater when contextual/semantic predictions were lower or when speech was temporally jittered. We conclude that speech temporal regularity is relevant for (theta) syllabic tracking and robust semantic predictions, while the joint support of temporal and contextual predictability reduces word- and phrase-level cortical tracking (delta).
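For readers unfamiliar with speech-to-brain synchronization, the sketch below shows one common way of quantifying it (not necessarily this study's exact pipeline): magnitude-squared coherence between the speech amplitude envelope and an MEG sensor signal, averaged over the theta band. The sampling rate and the synthetic signals are placeholders.

```python
import numpy as np
from scipy.signal import coherence, hilbert

# Placeholder setup: 60 s of synthetic "speech" and one synthetic MEG channel.
fs = 200.0  # Hz, assumed sampling rate after preprocessing
t = np.arange(0, 60, 1 / fs)
speech = np.random.randn(t.size)   # stand-in for a speech waveform
meg = np.random.randn(t.size)      # stand-in for one MEG channel

# Speech amplitude envelope via the Hilbert transform.
envelope = np.abs(hilbert(speech))

# Coherence spectrum between envelope and brain signal.
freqs, cxy = coherence(envelope, meg, fs=fs, nperseg=int(4 * fs))

# Average coherence in the theta band reported above (6.5-8 Hz).
theta = (freqs >= 6.5) & (freqs <= 8.0)
print("theta-band speech-to-brain coherence:", cxy[theta].mean())
```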
Word embeddings and pre-trained language models make it possible to build rich representations of text and have enabled improvements across most NLP tasks. Unfortunately, they are very expensive to train, and many small companies and research groups tend to use models that have been pre-trained and made available by third parties, rather than building their own. This is suboptimal because, for many languages, those models have been trained on smaller (or lower-quality) corpora. In addition, monolingual pre-trained models for non-English languages are not always available. At best, models for those languages are included in multilingual versions, where each language shares its quota of substrings and parameters with the rest of the languages. This is particularly true for smaller languages such as Basque. In this paper we show that a number of monolingual models (FastText word embeddings, FLAIR and BERT language models) trained with larger Basque corpora produce much better results than publicly available versions in downstream NLP tasks, including topic classification, sentiment classification, PoS tagging and NER. This work sets a new state of the art in those tasks for Basque. All benchmarks and models used in this work are publicly available.
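As an illustration of downstream use, the sketch below loads a monolingual Basque checkpoint into a sequence-classification head with Hugging Face transformers; the model identifier and label count are assumptions for illustration and should be replaced by the released models and the actual task setup (the classification head still needs fine-tuning on labeled data).

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Assumed identifier for a monolingual Basque BERT; substitute the released checkpoint.
model_name = "ixa-ehu/berteus-base-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=4)

# Example Basque sentence ("Athletic won the match at San Mamés.").
inputs = tokenizer("Athletic-ek partida irabazi du San Mamesen.",
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # untrained head: fine-tune on labeled topic data first
```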