People generally fail to produce random sequences by overusing alternating patterns and avoiding repeating ones, the gambler's fallacy bias. We can explain the neural basis of this bias in terms of a biologically motivated neural model that learns from errors in predicting what will happen next. Through mere exposure to random sequences over time, the model naturally develops a representation that is biased toward alternation, because of its sensitivity to some surprisingly rich statistical structure that emerges in these random sequences. Furthermore, the model directly produces the best-fitting bias-gain parameter for an existing Bayesian model, by which we obtain an accurate fit to the human data in random sequence production. These results show that our seemingly irrational, biased view of randomness can be understood instead as the perfectly reasonable response of an effective learning mechanism to subtle statistical structure embedded in random sequences.

gambler's fallacy | waiting time | neural network | temporal integration | Bayesian inference

People are prone to search for patterns in sequences of events, even when the sequences are completely random. In a famous game of roulette at the Monte Carlo casino in 1913, black repeated a record 26 times; people began extreme betting on red after about 15 repetitions (1). The gambler's fallacy, a belief that chance is a self-correcting process in which a deviation in one direction would induce a deviation in the opposite direction, has been deemed a misperception of random sequences (2). For decades, this fallacy has been thought to originate from the "representativeness bias," in which a sequence of events generated by a random process is expected to represent the essential characteristics of that process even when the sequence is short (3).

However, there is a surprising amount of systematic structure lurking within random sequences. For example, in the classic case of tossing a fair coin, where the probability of each outcome (heads or tails) is exactly 0.5 on every single trial, one would naturally assume that there is no possibility for some kind of interesting structure to emerge, given such a simple and desolate form of randomness. And yet, if one records the average amount of time for a pattern to first occur in a sequence (i.e., the waiting time statistic), it is significantly longer for a repetition (head-head HH or tail-tail TT, six tosses) than for an alternation (HT or TH, four tosses). This is despite the fact that, on average, repetitions and alternations are equally probable, occurring once in every four tosses (i.e., the same mean time statistic). For both of these facts to be true, it must be that repetitions are more bunched together over time: they come in bursts, with greater spacing between them, compared with alternations. Intuitively, this difference comes from the fact that repetitions can build upon each other (e.g., the sequence HHH contains two instances of HH), whereas alternations cannot. Statistically, the mean time and waiting time delineate the m...
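To make the distinction between the two statistics concrete, the short simulation below (a minimal sketch in Python; the pattern strings, sequence length, and trial count are illustrative choices rather than part of the model) estimates both the waiting time and the mean inter-occurrence time for the repetition HH and the alternation HT from simulated fair-coin sequences. It should recover roughly six versus four tosses for the waiting times, but roughly four tosses for both mean times.

```python
import random

def pattern_statistics(pattern, n_sequences=20_000, seq_len=200):
    """Estimate two statistics for a two-toss pattern in fair-coin sequences:
    - waiting time: number of tosses until the pattern first occurs
    - mean time: average spacing between successive (overlapping) occurrences
    """
    waiting_times, gaps = [], []
    for _ in range(n_sequences):
        seq = ''.join(random.choice('HT') for _ in range(seq_len))
        # Waiting time: toss count at which the pattern is first completed.
        first = seq.find(pattern)
        if first != -1:
            waiting_times.append(first + len(pattern))
        # Mean time: gaps between consecutive starting positions of the pattern,
        # counting overlapping occurrences (HHH contains two instances of HH).
        starts = [i for i in range(seq_len - 1) if seq[i:i + 2] == pattern]
        gaps.extend(b - a for a, b in zip(starts, starts[1:]))
    return (sum(waiting_times) / len(waiting_times),
            sum(gaps) / len(gaps))

if __name__ == '__main__':
    for pat in ('HH', 'HT'):
        wt, mt = pattern_statistics(pat)
        # Expected: waiting time near 6 for HH vs. near 4 for HT;
        # mean time near 4 for both patterns.
        print(f'{pat}: waiting time = {wt:.2f}, mean time = {mt:.2f}')
```

Counting overlapping occurrences is what produces the clustering of repetitions described above: HH can recur one toss after a previous HH, whereas two occurrences of HT must be at least two tosses apart, so repetitions arrive in bursts separated by longer gaps even though both patterns occur at the same overall rate.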