From foraging for food to learning complex games, many aspects of human behaviour can be framed as a search problem with a vast space of possible actions. Under finite search horizons, optimal solutions are generally unobtainable. How, then, do humans navigate vast problem spaces that require intelligent exploration of unobserved actions? Using a variety of bandit tasks with up to 121 arms, we study how humans search for rewards under limited search horizons, where the spatial correlation of rewards (in both generated and natural environments) provides traction for generalization. Across a variety of probabilistic and heuristic models, we find evidence that Gaussian Process function learning, combined with an optimistic Upper Confidence Bound sampling strategy, provides a robust account of how people use generalization to guide search. Our modelling results and parameter estimates are recoverable, and can be used to simulate human-like performance, providing insights about human behaviour in complex environments.
bioRxiv preprint (not peer-reviewed), first posted online Aug. 1, 2017; doi: http://dx.doi.org/10.1101/171371
previous work exploring inductive biases in pure function learning contexts 21,22 and human behaviour in univariate function optimization 23, we present a comprehensive approach using a robust computational modelling framework to understand how humans generalize in an active search task. Across three studies using uni- and bivariate multi-armed bandits with up to 121 arms, we compare a diverse set of computational models in their ability to predict individual human behaviour.
In all experiments, the majority of subjects are best captured by a model combining function learning using Gaussian Process (GP) regression with an optimistic Upper Confidence Bound (UCB) sampling strategy that directly balances expectations of reward with the reduction of uncertainty. Importantly, we recover meaningful and robust estimates about the nature of human generalization, showing the limits of traditional models of associative learning 24 in tasks where the environmental structure supports learning and inference.
The main contributions of this paper are threefold:
1. We introduce the spatially correlated multi-armed bandit as a paradigm for studying how people use generalization to guide search in larger problem spaces than traditionally used for studying human behaviour.
2. We find that a Gaussian Process model of function learning robustly captures how humans generalize and learn about the structure of the environment, where an observed tendency towards undergeneralization is shown to sometimes be beneficial.
3. We show that participants solve the exploration-exploitation dilemma by optimistically inflating expectations of reward by the underlying uncertainty, with recoverable evidence for the separate phenome...
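The combination of GP regression and UCB sampling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF kernel, the zero prior mean, the unit prior variance, and the names `length_scale`, `noise_var`, and `beta` are all assumptions chosen for illustration, and the arms are placed on a one-dimensional grid.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential similarity between two 1-D arrays of arm locations.
    Assumed kernel; a larger length_scale means broader generalization."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_all, length_scale=1.0, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at every arm in x_all."""
    K = rbf_kernel(x_obs, x_obs, length_scale) + noise_var * np.eye(len(x_obs))
    K_s = rbf_kernel(x_all, x_obs, length_scale)
    mean = K_s @ np.linalg.solve(K, y_obs)
    # Prior variance is 1 for this kernel; subtract what the data explain.
    var = 1.0 - np.einsum("ij,ji->i", K_s, np.linalg.solve(K, K_s.T))
    return mean, np.clip(var, 0.0, None)

def ucb_choice(mean, var, beta=0.5):
    """Pick the arm whose expected reward, inflated by uncertainty, is highest."""
    return int(np.argmax(mean + beta * np.sqrt(var)))

# Hypothetical example: 30 arms, one observed reward at arm 5.
arms = np.arange(30, dtype=float)
mean, var = gp_posterior(np.array([5.0]), np.array([1.0]), arms)
next_arm = ucb_choice(mean, var, beta=0.5)
```

With a small `beta` the rule stays on the arm already known to be rewarding; a larger `beta` shifts the choice towards nearby arms that are still uncertain, which is the sense in which uncertainty inflates expected reward.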
How do children and adults differ in their search for rewards? We considered three different hypotheses that attribute developmental differences to (a) children’s increased random sampling, (b) more directed exploration toward uncertain options, or (c) narrower generalization. Using a search task in which noisy rewards were spatially correlated on a grid, we compared the ability of 55 younger children (ages 7 and 8 years), 55 older children (ages 9–11 years), and 50 adults (ages 19–55 years) to successfully generalize about unobserved outcomes and balance the exploration–exploitation dilemma. Our results show that children explore more eagerly than adults but obtain lower rewards. We built a predictive model of search to disentangle the unique contributions of the three hypotheses of developmental differences and found robust and recoverable parameter estimates indicating that children generalize less and rely on directed exploration more than adults. We did not, however, find reliable differences in terms of random sampling.
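One way to see how the three hypotheses could be separated is to give each its own parameter in a choice rule. The sketch below is an assumption for illustration, not the model reported in the study: `temperature` stands in for random sampling, `beta` for directed exploration toward uncertain options, and the predicted means and uncertainties are taken as given (the breadth of generalization would enter through whatever model produces them).

```python
import numpy as np

def ucb_values(mean, sd, beta):
    """Value of each option: expected reward plus an uncertainty bonus.
    beta (assumed name) scales directed exploration."""
    return np.asarray(mean, dtype=float) + beta * np.asarray(sd, dtype=float)

def softmax_choice_probs(values, temperature=0.1):
    """Turn option values into choice probabilities.
    A higher temperature (assumed name) means more random sampling."""
    z = (values - np.max(values)) / temperature  # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Hypothetical two-option example: option 0 is known and good,
# option 1 has a lower expected reward but is highly uncertain.
mean = [1.0, 0.5]
sd = [0.1, 1.0]
p_no_bonus = softmax_choice_probs(ucb_values(mean, sd, beta=0.0))
p_with_bonus = softmax_choice_probs(ucb_values(mean, sd, beta=1.0))
```

In this toy case, raising `beta` moves choices toward the uncertain option specifically, whereas raising `temperature` flattens the probabilities across all options, which is why the two forms of exploration can in principle be told apart.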
While the influence of presentation formats has been widely studied in Bayesian reasoning tasks, we present the first systematic investigation of how presentation formats influence information search decisions. Four experiments were conducted across different probabilistic environments, where subjects (N = 2,858) chose between 2 possible search queries, each with binary probabilistic outcomes, with the goal of maximizing classification accuracy. We studied 14 different numerical and visual formats for presenting information about the search environment, constructed across 6 design features that have been prominently related to improvements in Bayesian reasoning accuracy (natural frequencies, posteriors, complement, spatial extent, countability, and part-to-whole information). The posterior variants of the icon array and bar graph formats led to the highest proportion of correct responses, and were substantially better than the standard probability format. Results suggest that presenting information in terms of posterior probabilities and visualizing natural frequencies using spatial extent (a perceptual feature) were especially helpful in guiding search decisions, although environments with a mixture of probabilistic and certain outcomes were challenging across all formats. Subjects who made more accurate probability judgments did not perform better on the search task, suggesting that simple decision heuristics may be used to make search decisions without explicitly applying Bayesian inference to compute probabilities. We propose a new take-the-difference (TTD) heuristic that identifies the accuracy-maximizing query without explicit computation of posterior probabilities.
Children are natural born explorers. While exploration and active learning are quintessential features of development and maturation, they also pose fundamental challenges to children and adults alike. In particular, efficiently searching for information and rewards requires balancing the dual goals of exploring unknown options to learn something new, and exploiting familiar options to obtain known rewards. At a restaurant, should you go with your usual favorite or should you try the chef's latest creation? As a child, should you play your favorite game again or try out something new? Exploring novel options can potentially reveal new and even better rewards, but could also lead to disappointment. Known as the explore-exploit dilemma, this