“…Recently, many works have established that with additional assumptions, e.g., low-rankness of the transition, function approximation for Q-functions, etc., the sample complexity does not depend on |S| [Li et al., 2011, Wen and Van Roy, 2017, Krishnamurthy et al., 2016, Jiang et al., 2017, Dann et al., 2018, Du et al., 2019b, Feng et al., 2020, Du et al., 2019c, Zhong et al., 2019, Jin et al., 2019, Du et al., 2019a, Roy and Dong, 2019, Lattimore and Szepesvari, 2019, Zanette et al., 2020]. However, to our knowledge, the sample complexity of all these works scales polynomially with H; the only exceptions require the transition to be deterministic [Wen and Van Roy, 2017, Du et al., 2020].…”