Selection hyper-heuristics are randomised search methodologies which choose and execute heuristics from a set of low-level heuristics. Recent time complexity analyses for the LeadingOnes benchmark function have shown that the standard simple random, permutation, random gradient, greedy and reinforcement learning selection mechanisms show no effects of learning. The idea behind the learning mechanisms is to continue to exploit the currently selected heuristic as long as it is successful. However, the probability that a promising heuristic is successful in the next step is relatively low when perturbing a reasonable solution to a combinatorial optimisation problem. In this paper we generalise the classical selection-perturbation mechanisms so that success can be measured over some fixed period of length τ, rather than in a single iteration. We present a benchmark function where it is necessary to learn to exploit a particular low-level heuristic, rigorously proving that it makes the difference between an efficient and an inefficient algorithm. For LeadingOnes we prove that the generalised random gradient mechanism approaches optimal performance while generalised greedy, although not as fast, still outperforms random local search. An experimental analysis shows that combining the two generalised mechanisms leads to even better performance.
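
To illustrate measuring success over a period of length τ rather than in a single iteration, the following is a minimal Python sketch of a generalised random gradient mechanism on LeadingOnes. It rests on assumptions that the text above does not state: the two low-level heuristics (flipping one or two distinct bits chosen uniformly at random), the acceptance of strict improvements only, and the names `flip_k_bits`, `generalised_random_gradient`, `seed` and `max_steps` are all hypothetical choices made for illustration.

```python
import random


def leading_ones(x):
    """Number of consecutive 1-bits at the start of the bit string."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count


def flip_k_bits(x, k, rng):
    """Return a copy of x with k distinct, uniformly chosen bits flipped."""
    y = list(x)
    for i in rng.sample(range(len(y)), k):
        y[i] ^= 1
    return y


def generalised_random_gradient(n, tau, seed=0, max_steps=10**6):
    """Sketch: keep the chosen heuristic while it succeeds within tau steps."""
    rng = random.Random(seed)
    # Assumed low-level heuristics: 1-bit flip and 2-bit flip (needs n >= 2).
    heuristics = [
        lambda x: flip_k_bits(x, 1, rng),
        lambda x: flip_k_bits(x, 2, rng),
    ]
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = leading_ones(x)
    steps = 0
    while fx < n and steps < max_steps:
        h = rng.choice(heuristics)  # select a heuristic uniformly at random
        remaining = tau             # length of the current learning period
        while remaining > 0 and fx < n and steps < max_steps:
            y = h(x)
            fy = leading_ones(y)
            steps += 1
            remaining -= 1
            if fy > fx:             # success: accept and restart the period
                x, fx = y, fy
                remaining = tau
    return x, fx, steps
```

Calling `generalised_random_gradient(20, 50)` returns the final bit string, its LeadingOnes value and the number of fitness evaluations used. Setting `tau=1` gives a mechanism that abandons the heuristic after a single unsuccessful step, which corresponds to the single-iteration notion of success that the generalisation relaxes.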