RAVEN: A Dataset for Relational and Analogical Visual REasoNing

Zhang, Chi; Gao, Feng; Jia, Baoxiong; Zhu, Yixin; Zhu, Song‐Chun

doi:10.1109/cvpr.2019.00546

Cited by 133 publications

(295 citation statements)

References 44 publications

Supporting

Mentioning

285

Contrasting

“…Su et al [13] showed that modifying one pixel only could lead up to 73% adversarial success rate depending on the used images. Recently, there is a growing interest in building neural networks that can learn to reason [76][77][78][79]. Saxon et al [77] demonstrated that current state-of-the-art neural networks show moderate performance in solving basic mathematical problems, the performance deteriorates for questions that require the computation of intermediate values.…”

Section: Related Work (mentioning)

confidence: 99%

“…The model was able to solve only 14/40 questions from maths exams for 16 year old schoolchildren in the UK. In [78][79] the researchers tested neural networks ability in structural, relational, and analogical reasoning by trying to solve IQ-like visual questions. In particular, they tested the models on the Raven's Progressive Matrices (RPM) dataset, which is correlated with many aspects of reasoning.…”

Section: Related Work (mentioning)

confidence: 99%

Artificial General Intelligence: A New Perspective, with Application to Scientific Discovery

Khalili

2022

Preprint

The dream of building machines that have human-level intelligence has inspired scientists for decades. Remarkable advances have been made recently; however, we are still far from achieving this goal. In this paper, I propose an alternative perspective on how these machines might be built focusing on the scientific discovery process which represents one of our highest abilities that requires a high level of reasoning and remarkable problem-solving ability. By trying to replicate the procedures followed by many scientists, the basic idea of the proposed approach is to use a set of principles to solve problems and discover new knowledge. These principles are extracted from different historical examples of scientific discoveries. Building machines that fully incorporate these principles in an automated way might open the doors for many advancements.

“…Recently efforts in this direction are started. In [172], a new dataset is proposed based on Raven's Progressive Matrices (RPM) for the task of visual recognition reasoning, comprising images and related RPM problems, with tree-structured annotations. A counting-based dataset is sampled from the available VQA 2.0 and Visual Genome datasets for the task-specific release [200].…”

Section: Datasets For Validating the Explainability In Multimod (mentioning)

confidence: 99%

A Review on Explainability in Multimodal Deep Neural Nets

2021

Artificial Intelligence techniques powered by deep neural nets have achieved much success in several application domains, most significantly and notably in the Computer Vision applications and Natural Language Processing tasks. Surpassing human-level performance propelled the research in the applications where different modalities amongst language, vision, sensory, text play an important role in accurate predictions and identification. Several multimodal fusion methods employing deep learning models are proposed in the literature. Despite their outstanding performance, the complex, opaque and black-box nature of the deep neural nets limits their social acceptance and usability. This has given rise to the quest for model interpretability and explainability, more so in the complex tasks involving multimodal AI methods. This paper extensively reviews the present literature to present a comprehensive survey and commentary on the explainability in multimodal deep neural nets, especially for the vision and language tasks. Several topics on multimodal AI and its applications for generic domains have been covered in this paper, including the significance, datasets, fundamental building blocks of the methods and techniques, challenges, applications, and future trends in this domain. Index Terms: deep multimodal learning, explainable AI, interpretability, survey, trends, vision and language research, XAI.

“…More recently, a wave of "data-driven Raven's agents" aims to learn integrated representations of visuospatial domain knowledge and problem-solving strategies by training on input-output pairs from a large number of example problems (44)(45)(46)(47)(48)(49).…”

Section: Different Types Of Raven's Problem-solving Agents (mentioning)

confidence: 99%

AI, visual imagery, and a case study on the challenges posed by human intelligence tests

Kunda

2020

Proc. Natl. Acad. Sci. U.S.A.

Observations abound about the power of visual imagery in human intelligence, from how Nobel prize-winning physicists make their discoveries to how children understand bedtime stories. These observations raise an important question for cognitive science, which is, what are the computations taking place in someone’s mind when they use visual imagery? Answering this question is not easy and will require much continued research across the multiple disciplines of cognitive science. Here, we focus on a related and more circumscribed question from the perspective of artificial intelligence (AI): If you have an intelligent agent that uses visual imagery-based knowledge representations and reasoning operations, then what kinds of problem solving might be possible, and how would such problem solving work? We highlight recent progress in AI toward answering these questions in the domain of visuospatial reasoning, looking at a case study of how imagery-based artificial agents can solve visuospatial intelligence tests. In particular, we first examine several variations of imagery-based knowledge representations and problem-solving strategies that are sufficient for solving problems from the Raven’s Progressive Matrices intelligence test. We then look at how artificial agents, instead of being designed manually by AI researchers, might learn portions of their own knowledge and reasoning procedures from experience, including learning visuospatial domain knowledge, learning and generalizing problem-solving strategies, and learning the actual definition of the task in the first place.
