2021
DOI: 10.48550/arxiv.2103.05823
Preprint

Fast and flexible: Human program induction in abstract reasoning tasks

Abstract: The Abstraction and Reasoning Corpus (ARC) is a challenging program induction dataset that was recently proposed by Chollet (2019). Here, we report the first set of results collected from a behavioral study of humans solving a subset of tasks from ARC (40 out of 1000). Although this subset of tasks contains considerable variation, our results showed that humans were able to infer the underlying program and generate the correct test output for a novel test input example, with an average of 80% of tasks solved …

Cited by 5 publications (9 citation statements)
References 13 publications (16 reference statements)
“…The high frequency of framing tags in the LARC corpus suggests that humans carefully establish context and resolve uncertainty over which programmatic concepts may be relevant: natural programs spend roughly a third of the time identifying which library function to run and how to parse the current task. This corroborates a key claim in [11], which states that unlike a typical interpreter that understands only a handful of programmatic concepts, the human interpreter contains a multitude of concepts useful in solving ARC tasks. Thus, if we were to build a rich system capable of solving complex tasks such as ARC with a multitude of concepts, it might be challenging to refer to these concepts via function names in the way computer programs do.…”
Section: Programmatic Concepts in Natural Programs (supporting)
confidence: 80%
“…If the describer fails the verification task, the submitted natural program is deemed incorrect and discarded. Since we are primarily interested in communicating rather than solving ARC tasks (in contrast to [11]), each describer was shown all previous verified descriptions for a task, allowing the describer to focus on constructing an informative natural program. This yields successive generations of descriptions, forming a chain of improving natural programs written by a group of humans in collaboration.…”
Section: Two-Player Communication Game (mentioning)
confidence: 99%
“…In contrast, in the recent Abstraction and Reasoning Corpus, the visual stimuli are small, abstract pixel images and, therefore, the domain is also well-defined, but much richer (Chollet, 2019). It thus offers a way to probe a similarly wide variety of visual and abstract concepts as BPs (Johnson, Vong, Lake, & Gureckis, 2021).…”
Section: Discussion (mentioning)
confidence: 99%
“…One of these tasks, the Abstraction and Reasoning Corpus (ARC) introduced by Chollet (2019), remains an open challenge. ARC tasks are challenging for machines because they require object recognition, abstract reasoning, and procedural analogies (Johnson et al. 2021; Acquaviva et al. 2022). ARC comprises 1000 unique tasks, where each task consists of a small set (typically three) of input-output image pairs for training, and generally one or occasionally multiple test pairs for evaluation (Figure 1).…”
Section: Introduction (mentioning)
confidence: 99%
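
As context for the task structure described in the citation statement above: ARC tasks are distributed in the public ARC repository as JSON files with "train" and "test" lists of input-output grid pairs, each grid a 2-D array of color codes 0-9. The following is a minimal sketch of a loader for that format; the file name and helper functions are illustrative assumptions, not anything from the paper itself.

```python
import json
from typing import Dict, List

Grid = List[List[int]]  # each cell holds a color code 0-9


def load_arc_task(path: str) -> Dict[str, List[Dict[str, Grid]]]:
    """Load one ARC task: a dict with 'train' and 'test' lists of
    {'input': grid, 'output': grid} pairs (typically ~3 train pairs)."""
    with open(path) as f:
        return json.load(f)


def describe(task: Dict[str, List[Dict[str, Grid]]]) -> None:
    """Print the dimensions of every demonstration and test grid."""
    for split in ("train", "test"):
        for i, pair in enumerate(task[split]):
            h_in, w_in = len(pair["input"]), len(pair["input"][0])
            h_out, w_out = len(pair["output"]), len(pair["output"][0])
            print(f"{split}[{i}]: {h_in}x{w_in} -> {h_out}x{w_out}")


if __name__ == "__main__":
    # Hypothetical usage; the file name stands in for any task file
    # from the public ARC repository.
    describe(load_arc_task("0a938d79.json"))
```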