Damien Sileo scite author profile

Damien Sileo

5 Publications

53 Citation Statements Received

180 Citation Statements Given

How they've been cited: 137

How they cite others: 180

Affiliations

KU Leuven

Publications


Mining Discourse Markers for Unsupervised Sentence Representation Learning

Sileo, Cruys, Pradel et al. 2019


Current state-of-the-art systems in NLP rely heavily on manually annotated datasets, which are expensive to construct. Very little work adequately exploits unannotated data (such as discourse markers between sentences), mainly because of data sparseness and ineffective extraction methods. In the present work, we propose a method to automatically discover sentence pairs with relevant discourse markers, and apply it to massive amounts of data. Our resulting dataset contains 174 discourse markers with at least 10K examples each, even for rare markers such as "coincidentally" or "amazingly". We use the resulting data as supervision for learning transferable sentence embeddings. In addition, we show that even though sentence representation learning through prediction of discourse markers yields state-of-the-art results across different transfer tasks, it is not clear that our models made use of the semantic relation between sentences, thus leaving room for further improvements. Our datasets are publicly available.
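The abstract does not spell out the extraction method, so the following is only a minimal sketch of the general idea of pairing consecutive sentences where the second one opens with a discourse marker. The marker list, the regex-based sentence splitter, and the function name `mine_pairs` are all assumptions made for illustration, not the authors' pipeline.

```python
import re

# Hypothetical, tiny marker list; the dataset described above covers 174 markers.
MARKERS = ("however", "coincidentally", "amazingly", "therefore")


def mine_pairs(text, markers=MARKERS):
    """Return (sentence1, sentence2, marker) triples for consecutive sentences
    where the second sentence starts with a marker followed by a comma."""
    # Naive sentence splitting on terminal punctuation (an assumption for brevity).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    pairs = []
    for first, second in zip(sentences, sentences[1:]):
        for marker in markers:
            prefix = marker + ","
            if second.lower().startswith(prefix):
                # Strip the marker so that it can serve as the label to predict.
                remainder = second[len(prefix):].strip()
                pairs.append((first, remainder, marker))
                break
    return pairs


text = (
    "It rained all day. However, the match went ahead. "
    "Coincidentally, both teams wore blue."
)
pairs = mine_pairs(text)
for first, second, marker in pairs:
    print(marker, "|", first, "|", second)
```

Each triple can then be treated as a classification example: given the two sentences, predict the marker that was removed.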


Zero-Shot Recommendation as Language Modeling

Sileo, Vossen, Raymaekers 2022


NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

Dhole, Gangal, Gehrmann et al. 2021

Preprint


Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data splits according to specific features). We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks. We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models. The infrastructure, datacards and robustness analysis results are available publicly on the NL-Augmenter repository (https://github.com/GEM-benchmark/NL-Augmenter).
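To make the distinction between the two concepts concrete, here is a minimal sketch of a transformation (which modifies the data) and a filter (which selects a subset by a feature). The function names and signatures are hypothetical and are not the NL-Augmenter interfaces; see the repository for the actual API.

```python
# Hypothetical stand-ins; these are NOT the NL-Augmenter classes or method names.

def swap_case_transformation(sentence):
    """Transformation: return a modified copy of the input sentence."""
    return sentence.swapcase()


def min_length_filter(sentence, min_tokens=5):
    """Filter: keep only sentences with at least min_tokens whitespace tokens."""
    return len(sentence.split()) >= min_tokens


data = [
    "Data augmentation helps evaluate model robustness.",
    "Short one.",
]

# A transformation yields a perturbed version of every example.
augmented = [swap_case_transformation(s) for s in data]

# A filter yields a data split containing only the examples with the feature.
long_subset = [s for s in data if min_length_filter(s)]

print(augmented)
print(long_subset)
```

A robustness check would then compare a model's score on `data` with its score on `augmented`, or report scores separately on `long_subset` and on the remaining examples.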


Visual Grounding Strategies for Text-Only Natural Language Processing

Sileo 2021

Preprint


Visual grounding is a promising path toward more robust and accurate Natural Language Processing (NLP) models. Many multimodal extensions of BERT (e.g., VideoBERT, LXMERT, VL-BERT) allow joint modeling of texts and images that leads to state-of-the-art results on multimodal tasks such as Visual Question Answering. Here, we leverage multimodal modeling for purely textual tasks (language modeling and classification) with the expectation that the multimodal pretraining provides a grounding that can improve text processing accuracy. We propose possible strategies in this respect. A first type of strategy, referred to as transferred grounding, consists in applying multimodal models to text-only tasks using a placeholder to replace image input. The second one, which we call associative grounding, harnesses image retrieval to match texts with related images during both pretraining and text-only downstream tasks. We draw further distinctions within both strategies and then compare them according to their impact on language modeling and commonsense-related downstream tasks, showing improvement over text-only baselines.


Composition of Sentence Embeddings: Lessons from Statistical Relational Learning

Sileo, Cruys, Pradel et al. 2019


Various NLP problems (such as the prediction of sentence similarity, entailment, and discourse relations) are all instances of the same general task: the modeling of semantic relations between a pair of textual elements. A popular model for such problems is to embed sentences into fixed-size vectors, and use composition functions (e.g. concatenation or sum) of those vectors as features for the prediction. At the same time, composition of embeddings has been a main focus within the field of Statistical Relational Learning (SRL), whose goal is to predict relations between entities (typically from knowledge base triples). In this article, we show that previous work on relation prediction between texts implicitly uses compositions from baseline SRL models. We show that such compositions are not expressive enough for several tasks (e.g. natural language inference). We build on recent SRL models to address textual relational problems, showing that they are more expressive, and can alleviate issues from simpler compositions. The resulting models significantly improve the state of the art in both transferable sentence representation learning and relation prediction.
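As an illustration of what a composition function is, the sketch below applies a few simple compositions to two toy sentence vectors and checks which of them are order-invariant. The toy vectors, the choice of compositions beyond concatenation and sum, and the symmetry check are assumptions added here for illustration; they are not taken from the abstract. An order-invariant composition produces the same features for (u, v) and (v, u), which is one way a simple composition can fall short on a directional relation such as entailment.

```python
def concat(u, v):
    """Concatenation: keeps both vectors, so argument order is preserved."""
    return u + v


def add(u, v):
    """Element-wise sum."""
    return [a + b for a, b in zip(u, v)]


def product(u, v):
    """Element-wise product."""
    return [a * b for a, b in zip(u, v)]


def abs_diff(u, v):
    """Element-wise absolute difference."""
    return [abs(a - b) for a, b in zip(u, v)]


# Toy "sentence embeddings" (hypothetical values).
u = [1.0, 2.0, 3.0]
v = [0.5, -1.0, 4.0]

compositions = {
    "concat": concat,
    "sum": add,
    "product": product,
    "abs_diff": abs_diff,
}

# A composition is order-invariant here if swapping the arguments changes nothing.
symmetric = {name: f(u, v) == f(v, u) for name, f in compositions.items()}

for name, is_symmetric in symmetric.items():
    print(f"{name}: order-invariant = {is_symmetric}")
```

Of these four, only concatenation distinguishes the two argument orders, so a classifier fed only the other three cannot tell "u entails v" apart from "v entails u".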



