“…Besides guiding the practical choice of b, Theorem 3 provides insight into the question of whether simulation-based methods are suited to "break" the curse of dimensionality. This question is the main focus of Rust's analysis of density-based random approximations of the Bellman operator (Rust, 1997). In particular, Rust concludes that if the transition density p(y | x, a) is known, an algorithm similar to ours can be used to approximate the value function in a computation time that is only polynomial in d. By contrast, Theorem 3 suggests that for kernel-based reinforcement learning, where p(y | x, a) is unknown, the number of observations grows exponentially in d even if b is chosen optimally.…”
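
The exponential dependence on d can be made concrete with a generic local-averaging argument. The sketch below is not the rate from Theorem 3, which is not reproduced in this excerpt. It only assumes that observations are spread uniformly over the unit cube [0, 1]^d and that a kernel of bandwidth b effectively averages over an axis-aligned cube of side b, which has volume b^d. The function name `samples_needed` and the target neighbor count `k` are hypothetical, introduced purely for illustration.

```python
def samples_needed(k: int, b: float, d: int) -> float:
    """Sample size m at which about k of m uniform points in [0, 1]^d
    are expected to fall inside a cube of side b.

    Illustrative assumption: a fraction b**d of the points lands in the
    local cube, so m * b**d = k gives m = k / b**d.
    """
    if not 0.0 < b <= 1.0:
        raise ValueError("b must lie in (0, 1]")
    if d < 1:
        raise ValueError("d must be a positive integer")
    return k / b ** d


# Hold the bandwidth and the neighbor count fixed and vary the dimension.
for d in (1, 2, 5, 10):
    print(f"d={d:2d}  samples needed ~ {samples_needed(k=10, b=0.1, d=d):.3g}")
```

With b and k held fixed, each additional dimension multiplies the required sample size by 1/b. This is the kind of growth the passage refers to, although the optimal choice of b and the exact exponent depend on the theorem itself.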