“…For instance, due to their architecture and training regime, Transformers often fail at simple arithmetic (Floridi and Chiriatti 2020, Patel, Bhattamishra and Goyal 2021), arrive at bizarre deductions in scenarios that require real-world knowledge, and sometimes output obvious non sequiturs with sudden and extreme topic shifts that would be absurd coming from a human writer or speaker (Marcus and Davis 2020). Furthermore, given that any "knowledge" about the world that may be encoded in the model is not grounded in experience or reasoning but is filtered through language and its statistical properties (e.g., Alberts 2022), such as the frequent co-occurrence of certain terms, Transformers often resort to heuristics: they produce associatively plausible rather than factually correct answers to information questions (Sobieszek and Price 2022), and to some extent rely on simple lexical overlap between a premise and a hypothesis to predict entailment or non-entailment (McCoy, Pavlick and Linzen 2019).…”
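To make the lexical-overlap heuristic concrete, the toy classifier below is a minimal sketch (not taken from McCoy, Pavlick and Linzen 2019, whose findings concern learned model behavior) of the shortcut they describe: predicting entailment whenever every word of the hypothesis also appears in the premise, regardless of syntax or word order. The example sentences are illustrative, in the style of their HANS diagnostic set.

```python
# A hypothetical sketch of the "lexical overlap" heuristic: label a
# premise-hypothesis pair as entailment iff every hypothesis word
# occurs somewhere in the premise, ignoring word order and syntax.

def lexical_overlap_predict(premise: str, hypothesis: str) -> str:
    premise_words = set(premise.lower().split())
    hypothesis_words = set(hypothesis.lower().split())
    if hypothesis_words <= premise_words:
        return "entailment"
    return "non-entailment"

# The heuristic happens to succeed when the hypothesis really is
# a fragment of the premise:
print(lexical_overlap_predict(
    "The lawyer saw the doctor near the bank",
    "The lawyer saw the doctor",
))  # -> "entailment" (correct)

# But it fails on simple word-order reversals, where all the words
# overlap yet the meaning is inverted:
print(lexical_overlap_predict(
    "The doctor paid the actor",
    "The actor paid the doctor",
))  # -> "entailment" (wrong: the premise does not entail this)
```

The second call illustrates the failure mode at issue: a purely associative cue (shared vocabulary) points to entailment even though the syntactic roles, and hence the meaning, are reversed.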