The bias/variance tradeoff is fundamental to learning: increasing a model's complexity can improve its fit on training data, but potentially worsens performance on future samples [1]. Remarkably, however, the human brain effortlessly handles a wide range of complex pattern recognition tasks. On the basis of these conflicting observations, it has been argued that useful biases in the form of "generic mechanisms for representation" must be hardwired into cortex [2]. This note describes a useful bias that encourages cooperative learning which is both biologically plausible and rigorously justified [3][4][5][6][7][8][9].

Let us outline the problem. Neurons learn inductively. They generalize from finite samples and encode estimates of future outcomes (for example, rewards) into their spiketrains [10]. Results from learning theory imply that generalizing successfully requires strong biases [1] or, in other words, specialization. Thus, at any given time some neurons' specialties are more relevant than others. Since most of the data neurons receive are other neurons' outputs, it is essential that neurons indicate which of their outputs encode high quality estimates. Downstream neurons should then be biased to specialize on these outputs.

The proposed biasing mechanism is based on a constraint on the effective information, ei, generated by spikes; see Eq. (*) below. The motivation for using effective information comes from a connection to learning theory explained in §2. There, we show that the ei generated by empirical risk minimization quantifies capacity: higher ei yields tighter generalization bounds.

Sections 3 and 4 consider implications of the constraint in two cases: abstractly and for a concrete model.
In both cases we find that imposing constraint (*) implies: (i) essentially all information is carried by spikes; (ii) spikes encode reward estimates; and (iii) the higher the effective information, the better the guarantees on estimates.

Although the proposal is inspired by cortical learning, the main ideas are information-theoretic, suggesting they may also apply to other examples of interacting populations of adaptive agents.