“…Recently, pre-trained language models (Peters et al., 2018; Devlin et al., 2019; Brown et al., 2020) have achieved promising performance on many NLP tasks. Beyond using the universal representations from pre-trained models in downstream tasks, several studies have shown the potential of pre-trained masked language models (MLMs), e.g., BERT (Devlin et al., 2019) and RoBERTa (Liu et al., 2019b), to serve as factual knowledge bases (Petroni et al., 2019; Bouraoui et al., 2020; Jiang et al., 2020b; Shin et al., 2020; Jiang et al., 2020a; Wang et al., 2020; Kassner and Schütze, 2020a; Kassner et al., 2020). For example, to extract the birthplace of Steve Jobs, we can query an MLM such as BERT with "Steve Jobs was born in [MASK]", where "Steve Jobs" is the subject of the fact, "was born in" is a prompt string for the relation "place-of-birth", and [MASK] is a placeholder for the object to be predicted.…”
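
The cloze-style query described above can be sketched as follows, assuming the Hugging Face `fill-mask` pipeline and a `bert-base-cased` checkpoint; these are illustrative choices, not necessarily the exact setup used in the cited work.

```python
# A minimal sketch of probing an MLM with a cloze prompt; the model name
# and prompt wording are assumptions for illustration only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-cased")

# "Steve Jobs" is the subject, "was born in" expresses the relation
# "place-of-birth", and [MASK] is the slot for the object to predict.
predictions = fill_mask("Steve Jobs was born in [MASK].")

# Each prediction carries the filled token and its probability score.
for p in predictions:
    print(f"{p['token_str']:>12}  {p['score']:.3f}")
```

The highest-scoring filler for the [MASK] slot is then read off as the model's answer for the queried fact.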