Malvina Nissim scite author profile

We compare two ways of obtaining lexical knowledge for antecedent selection in other-anaphora and definite noun phrase coreference. Specifically, we compare an algorithm that relies on links encoded in the manually created lexical hierarchy WordNet and an algorithm that mines corpora by means of shallow lexico-semantic patterns. As corpora we use the British National Corpus (BNC), as well as the Web, which has not been previously used for this task. Our results show that (a) the knowledge encoded in WordNet is often insufficient, especially for anaphorantecedent relations that exploit subjective or context-dependent knowledge; (b) for otheranaphora, the Web-based method outperforms the WordNet-based method; (c) for definite NP coreference, the Web-based method yields results comparable to those obtained using WordNet over the whole data set and outperforms the WordNet-based method on subsets of the data set; (d) in both case studies, the BNC-based method is worse than the other methods because of data sparseness. Thus, in our studies, the Web-based method alleviated the lexical knowledge gap often encountered in anaphora resolution and handled examples with context-dependent relations between anaphor and antecedent. Because it is inexpensive and needs no hand-modeling of lexical knowledge, it is a promising knowledge source to integrate into anaphora resolution systems.

show abstract

The Meaning Factory: Formal Semantics for Recognizing Textual Entailment and Determining Semantic Similarity

Bjerva¹,

Bos²,

Goot³

et al. 2014

View full text Add to dashboard Cite

Shared Task 1 of SemEval-2014 comprised two subtasks on the same dataset of sentence pairs: recognizing textual entailment and determining textual similarity. We used an existing system based on formal semantics and logical inference to participate in the first subtask, reaching an accuracy of 82%, ranking in the top 5 of more than twenty participating systems. For determining semantic similarity we took a supervised approach using a variety of features, the majority of which was produced by our system for recognizing textual entailment. In this subtask our system achieved a mean squared error of 0.322, the best of all participating systems.

show abstract

Adding Semantics to Data-Driven Paraphrasing

Pavlick¹,

Bos²,

Nissim³

et al. 2015

View full text Add to dashboard Cite

We add an interpretable semantics to the paraphrase database (PPDB). To date, the relationship between phrase pairs in the database has been weakly defined as approximately equivalent. We show that these pairs represent a variety of relations, including directed entailment (little girl/girl) and exclusion (nobody/someone). We automatically assign semantic entailment relations to entries in PPDB using features derived from past work on discovering inference rules from text and semantic taxonomy induction. We demonstrate that our model assigns these relations with high accuracy. In a downstream RTE task, our labels rival relations from WordNet and improve the coverage of a proof-based RTE system by 17%.

show abstract

Exploring the boundaries: gene and protein identification in biomedical text

et al. 2005

View full text Add to dashboard Cite

show abstract

Syntactic features and word similarity for supervised metonymy resolution

Nissim

Markert

2003

View full text Add to dashboard Cite

We present a supervised machine learning algorithm for metonymy resolution, which exploits the similarity between examples of conventional metonymy. We show that syntactic head-modifier relations are a high precision feature for metonymy recognition but suffer from data sparseness. We partially overcome this problem by integrating a thesaurus and introducing simpler grammatical features, thereby preserving precision and increasing recall. Our algorithm generalises over two levels of contextual similarity. Resulting inferences exceed the complexity of inferences undertaken in word sense disambiguation. We also compare automatic and manual methods for syntactic feature extraction.

show abstract

Thank you BART! Rewarding Pre-Trained Models Improves Formality Style Transfer

Lai

Toral

Nissim

2021

View full text Add to dashboard Cite

Scarcity of parallel data causes formality style transfer models to have scarce success in preserving content. We show that fine-tuning pre-trained language (GPT-2) and sequenceto-sequence (BART) models boosts content preservation, and that this is possible even with limited amounts of parallel data. Augmenting these models with rewards that target style and content -the two core aspects of the task-we achieve a new state-of-the-art.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Malvina Nissim

Exploiting context for biomedical entity recognition

Overview of the Evalita 2016 SENTIment POLarity Classification Task

Comparing Knowledge Sources for Nominal Anaphora Resolution

The Meaning Factory: Formal Semantics for Recognizing Textual Entailment and Determining Semantic Similarity

Adding Semantics to Data-Driven Paraphrasing

Exploring the boundaries: gene and protein identification in biomedical text

Syntactic features and word similarity for supervised metonymy resolution

Thank you BART! Rewarding Pre-Trained Models Improves Formality Style Transfer

Contact Info

Product

Resources

About