Abstract: We approach the recognition of textual entailment using logical semantic representations and a theorem prover. In this setup, lexical divergences that preserve semantic entailment between the source and target texts need to be explicitly stated. However, recognising subsentential semantic relations is not trivial. We address this problem by monitoring the proof of the theorem and detecting unprovable sub-goals that share predicate arguments with logical premises. If a linguistic relation exists, then an approp…
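The proof-monitoring idea in this abstract can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the atom representation, `shared_argument_pairs`, `propose_axioms`, and the synonym lookup are all hypothetical stand-ins for the paper's on-demand axiom-injection mechanism.

```python
# Illustrative reconstruction (not the authors' code) of on-demand axiom
# injection: pair unprovable sub-goals with premises that share a predicate
# argument, then emit a bridging axiom when a lexical relation licenses it.
# Atoms are (predicate, args) tuples; all names here are hypothetical.

def shared_argument_pairs(premises, subgoals):
    """Pair each unprovable sub-goal with premises sharing an argument."""
    pairs = []
    for g_pred, g_args in subgoals:
        for p_pred, p_args in premises:
            if p_pred != g_pred and set(p_args) & set(g_args):
                pairs.append((p_pred, g_pred))
    return pairs

def propose_axioms(pairs, lexical_relation):
    """Emit an axiom p -> g whenever the lexical relation licenses it."""
    return [f"forall x. {p}(x) -> {g}(x)"
            for p, g in pairs if lexical_relation(p, g)]

# Toy example: a WordNet-style synonym check stubbed as a set lookup.
synonyms = {("buy", "purchase")}
premises = [("buy", ("e1", "john", "car"))]
subgoals = [("purchase", ("e1", "john", "car"))]
axioms = propose_axioms(shared_argument_pairs(premises, subgoals),
                        lambda p, g: (p, g) in synonyms)
# axioms -> ["forall x. buy(x) -> purchase(x)"]
```

The key design point the abstract describes is that axioms are generated only for sub-goals that actually block the proof, rather than precomputed for the whole lexicon.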
“…Systems: As inference is closely related to logic, there has always been a line of research building logic-based or logic-and-machine-learning hybrid models for NLI/RTE problems (e.g., MacCartney, 2009; Abzianidze, 2015; Martínez-Gómez et al., 2017; Yanaka et al., 2018). Re-implementations of these transformer models for Chinese have led to similar successes on related tasks. For example, Cui et al. (2019) report that a large RoBERTa model, pre-trained with whole-word masking, achieves the highest accuracy (81.2%) among their transformer models on XNLI.…”
Despite the tremendous recent progress on natural language inference (NLI), driven largely by large-scale investment in new datasets (e.g., SNLI, MNLI) and advances in modeling, most progress has been limited to English due to a lack of reliable datasets for most of the world's languages. In this paper, we present the first large-scale NLI dataset (consisting of ∼56,000 annotated sentence pairs) for Chinese, called the Original Chinese Natural Language Inference dataset (OCNLI). Unlike recent attempts at extending NLI to other languages, our dataset does not rely on any automatic translation or non-expert annotation. Instead, we elicit annotations from native speakers specializing in linguistics. We closely follow the annotation protocol used for MNLI, but create new strategies for eliciting diverse hypotheses. We establish several baseline results on our dataset using state-of-the-art pre-trained models for Chinese, and find even the best performing models to be far outpaced by human performance (∼12% absolute performance gap), making it a challenging new resource that we hope will help to accelerate progress in Chinese natural language understanding. To the best of our knowledge, this is the first human-elicited MNLI-style corpus for a non-English language.
“…In ccg2lambda, two wide-coverage CCG parsers, C&C (Clark and Curran, 2007) and EasyCCG (Lewis and Steedman, 2014), are used for converting tokenized sentences into CCG trees robustly. According to a previous study (Martínez-Gómez et al., 2017), EasyCCG achieves higher accuracy. Thus, when the output of both C&C and EasyCCG can be proved, we use EasyCCG's output for creating features.…”
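The parser-selection policy this snippet describes can be sketched as follows. The function name, the parser labels, and the `prove` callback are hypothetical stand-ins, not ccg2lambda's actual interface; the point is only the preference order when both parsers' outputs are provable.

```python
# Minimal sketch (hypothetical names) of the policy above: attempt a proof
# from each parser's CCG tree and prefer EasyCCG's output when it succeeds.

def select_proof(prove, ccg_trees):
    """ccg_trees maps parser name -> CCG tree; prove returns a proof or None."""
    proofs = {name: prove(tree) for name, tree in ccg_trees.items()}
    proofs = {name: p for name, p in proofs.items() if p is not None}
    if "easyccg" in proofs:           # prefer EasyCCG when it can be proved
        return "easyccg", proofs["easyccg"]
    if proofs:                        # otherwise fall back to any success
        return next(iter(proofs.items()))
    return None, None                 # no parser's output could be proved

# Toy example: a stub prover that "proves" any non-empty tree.
parser, proof = select_proof(lambda t: "proof" if t else None,
                             {"candc": "tree1", "easyccg": "tree2"})
# parser -> "easyccg"
```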
Section: Related Work
“…The inference system implemented in ccg2lambda using Coq achieves efficient automatic inference by feeding a set of predefined tactics and user-defined proof-search tactics to its interactive mode. The natural deduction system is particularly suitable for injecting external axioms during the theorem-proving process (Martínez-Gómez et al., 2017).…”
Determining semantic textual similarity is a core research subject in natural language processing. Since vector-based models for sentence representation often use shallow information, capturing accurate semantics is difficult. By contrast, logical semantic representations capture deeper levels of sentence semantics, but their symbolic nature does not offer graded notions of textual similarity. We propose a method for determining semantic textual similarity by combining shallow features with features extracted from natural deduction proofs of bidirectional entailment relations between sentence pairs. For the natural deduction proofs, we use ccg2lambda, a higher-order automatic inference system, which converts Combinatory Categorial Grammar (CCG) derivation trees into semantic representations and conducts natural deduction proofs. Experiments show that our system was able to outperform other logic-based systems and that features derived from the proofs are effective for learning textual similarity.
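The feature-combination idea in this abstract can be illustrated with a small sketch. The specific features and the proof-result interface below are assumptions for illustration, not ccg2lambda's API: shallow lexical-overlap features are concatenated with features read off the forward and backward entailment proofs, and the combined vector would feed a single regressor.

```python
# Hedged sketch of combining shallow features with proof-derived features
# for graded similarity. Feature choices and the proof dicts are illustrative.

def shallow_features(s1, s2):
    """Word-level features: Jaccard overlap and length difference."""
    t1, t2 = set(s1.split()), set(s2.split())
    overlap = len(t1 & t2) / max(len(t1 | t2), 1)
    return [overlap, abs(len(t1) - len(t2))]

def proof_features(proof_fwd, proof_bwd):
    """Encode whether each direction was proved and how many steps it took."""
    feats = []
    for proof in (proof_fwd, proof_bwd):
        feats += [1.0 if proof["proved"] else 0.0, float(proof["steps"])]
    return feats

def similarity_features(s1, s2, proof_fwd, proof_bwd):
    return shallow_features(s1, s2) + proof_features(proof_fwd, proof_bwd)

# Toy example with stubbed bidirectional proof results.
feats = similarity_features("a man plays guitar", "a man plays a guitar",
                            {"proved": True, "steps": 3},
                            {"proved": True, "steps": 4})
# feats -> [1.0, 0, 1.0, 3.0, 1.0, 4.0]
```

Proving both directions rather than one is what lets a symbolic system distinguish paraphrase-like pairs from mere one-way entailments, which is the graded signal the abstract is after.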
“…As mentioned earlier, these systems try to prove whether T entails H by applying a theorem prover to the logical formulas converted from the CCG trees. We report results for ccg2lambda with the default settings (with SPSA abduction; Martínez-Gómez et al. (2017)) and results for two versions of LangPro, one described in Abzianidze (2015) (henceforth LangPro15) and the other in Abzianidze (2017) (LangPro17). Briefly, the difference between the two versions is that LangPro17 is more robust to parse errors.…”
In formal logic-based approaches to Recognizing Textual Entailment (RTE), a Combinatory Categorial Grammar (CCG) parser is used to parse input premises and hypotheses to obtain their logical formulas. Here, it is important that the parser processes the sentences consistently: failing to recognize a shared syntactic structure yields inconsistent predicate-argument structures across the formulas, in which case the subsequent theorem proving is doomed to fail. In this work, we present a simple method to extend an existing CCG parser to parse a set of sentences consistently, achieved by modeling inter-sentence dependencies with a Markov Random Field (MRF). When combined with existing logic-based systems, our method consistently improves results in RTE experiments on English and Japanese.
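The MRF idea in this abstract can be sketched with a toy example. This is not the paper's model: the potentials, the similarity function, and brute-force MAP inference over two sentences are illustrative assumptions chosen to keep the example small. Unary potentials come from each parser's k-best scores; pairwise potentials reward choosing structurally similar parses across sentences, which pushes the joint assignment toward consistency.

```python
# Illustrative sketch (not the paper's model) of consistent parse selection
# with a pairwise MRF: unary scores from the parser's k-best list, pairwise
# potentials rewarding cross-sentence similarity, brute-force MAP inference.

from itertools import product

def select_consistent(kbest, similarity, weight=1.0):
    """kbest: one list per sentence of (parse, score) candidates."""
    best, best_score = None, float("-inf")
    # Enumerate every joint assignment of one candidate per sentence.
    for choice in product(*[range(len(cands)) for cands in kbest]):
        score = sum(kbest[i][j][1] for i, j in enumerate(choice))
        # Pairwise potentials: reward similar parses across sentence pairs.
        for i in range(len(choice)):
            for k in range(i + 1, len(choice)):
                score += weight * similarity(kbest[i][choice[i]][0],
                                             kbest[k][choice[k]][0])
        if score > best_score:
            best, best_score = choice, score
    return [kbest[i][j][0] for i, j in enumerate(best)]

# Toy example: parses reduced to category strings; similarity = exact match.
# The individually top-scoring parses disagree, so the pairwise potential
# flips both sentences onto a shared analysis.
kbest = [[("S/NP", 0.9), ("S\\NP", 0.8)],
         [("S\\NP", 0.7), ("S/NP", 0.6)]]
parses = select_consistent(kbest, lambda a, b: 1.0 if a == b else 0.0)
# parses -> ["S/NP", "S/NP"]
```

Brute-force enumeration is exponential in the number of sentences; a real system would use approximate MRF inference, but the objective being optimized is the same shape.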
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.