“…For nonconvex optimization, these latter variants exhibit a worst-case O(ε^{−3/2}) complexity order to find an ε-first-order minimizer, compared with the O(ε^{−2}) order of second-order trust-region methods [26], [12, Section 3.2]. Adaptive cubic regularization was later extended to handle inexact derivatives [40,41,2,1], probabilistic models [1,13], and even schemes in which the value of the objective function is never computed [24]. However, as noted in [33], the improvement in complexity has been obtained by trading the simple Newton step, which requires only the solution of a single linear system, for more complex or slower procedures, such as secular iterations, possibly using Lanczos preprocessing [6,8] (see also [12, Chapters 8 to 10]), or (conjugate-)gradient descent [29,4].…”
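The trade-off described in the quotation can be made concrete with a small sketch: the classical Newton step solves one linear system, whereas a cubic-regularization step must (approximately) minimize a regularized model, here done by plain gradient descent as one instance of the "slower procedures" mentioned. The step size, iteration count, and test problem below are illustrative assumptions, not taken from the source.

```python
import numpy as np

def newton_step(g, H):
    # Classical Newton step: a single linear system H s = -g.
    return np.linalg.solve(H, -g)

def cubic_reg_step(g, H, sigma, lr=0.01, iters=2000):
    # Sketch of a cubic-regularization subproblem solve: minimize
    #   m(s) = g^T s + (1/2) s^T H s + (sigma/3) ||s||^3
    # by plain gradient descent (illustrative choice; practical codes
    # use secular iterations, Lanczos, or Krylov methods instead).
    s = np.zeros_like(g)
    for _ in range(iters):
        grad_m = g + H @ s + sigma * np.linalg.norm(s) * s
        s -= lr * grad_m
    return s
```

For a positive-definite Hessian and small σ, the cubic-regularized step is close to the Newton step, but it costs many inner iterations rather than one factorization, which is exactly the trade mentioned in the text.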