Data with low-dimensional nonlinear structure are ubiquitous in engineering and scientific problems. We study a model problem with such structure: a binary classification task that uses a deep fully-connected neural network to classify data drawn from two disjoint smooth curves on the unit sphere. Aside from mild regularity conditions, we place no restrictions on the configuration of the curves. We prove that when (i) the network depth is large relative to certain geometric properties that set the difficulty of the problem and (ii) the network width and number of samples are polynomial in the depth, randomly initialized gradient descent quickly learns to correctly classify all points on the two curves with high probability. To our knowledge, this is the first generalization guarantee for deep networks with nonlinear data that depends only on intrinsic data properties. Our analysis proceeds by a reduction to dynamics in the neural tangent kernel (NTK) regime, where the network depth plays the role of a fitting resource in solving the classification problem. In particular, via fine-grained control of the decay properties of the NTK, we demonstrate that when the network is sufficiently deep, the NTK can be locally approximated by a translationally invariant operator on the manifolds and stably inverted over smooth functions, which guarantees convergence and generalization.
The use of Gröbner bases as an initial stage in determining the roots of systems of polynomials is well established as a more robust technique than numerical methods. The technique is described as producing 'the exact solution of systems of algebraic equations...' [1]. However, there are some theoretical limitations on the robustness of this method which, under certain circumstances, leave no confidence in its results. It is not the intention of this paper to provide answers to these problems but rather to consider their nature and the conditions under which they arise. The comments in this paper are targeted specifically at zero-dimensional Gröbner bases defined over a rational field. These conditions will be assumed in what follows. Until otherwise stated it will also be assumed that the term ordering is lexicographic and that x < y. This paper is concerned with applications in which the explicit values of roots are required. In many applications real roots are required, or more specifically a rational approximation to the real roots which, if they are irrational, must be obtainable to any specified degree of precision. In some applications only certain of these roots are sought, those wanted being determined by inequality constraints on the indeterminates. Applications in which algebraic numbers can be maintained throughout the evaluation without error are not considered. In calling this equation-solving technique robust, what is meant is that an algorithm exists (Buchberger's algorithm [1]) that will put a system of polynomials into a form (triangular form) for which an algorithm exists (back substitution) that can determine all the roots to a desired level of accuracy. (Other methods of evaluation will be considered later.)
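As a rough illustration of the back-substitution step described above, the following sketch starts from a system that is already in triangular form and computes a rational enclosure of one real root to a chosen precision. The system, the function names `evaluate` and `isolate_root`, and the use of bisection are all assumptions made for this example; they are not taken from the paper, and the Gröbner basis computation itself is not shown.

```python
from fractions import Fraction


def evaluate(coeffs, t):
    """Evaluate a polynomial by Horner's rule; coeffs run from highest degree down."""
    result = Fraction(0)
    for c in coeffs:
        result = result * t + c
    return result


def isolate_root(coeffs, lo, hi, tol):
    """Bisect [lo, hi] in exact rational arithmetic until its width is at most tol.

    Assumes the polynomial changes sign between lo and hi.
    """
    lo, hi = Fraction(lo), Fraction(hi)
    f_lo, f_hi = evaluate(coeffs, lo), evaluate(coeffs, hi)
    if f_lo == 0:
        return lo, lo
    if f_hi == 0:
        return hi, hi
    if (f_lo > 0) == (f_hi > 0):
        raise ValueError("interval does not bracket a sign change")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = evaluate(coeffs, mid)
        if f_mid == 0:
            return mid, mid
        if (f_mid > 0) == (f_lo > 0):
            lo = mid  # sign at mid matches sign at lo, so the root lies in [mid, hi]
        else:
            hi = mid
    return lo, hi


# Hypothetical triangular system (an assumption for illustration, not from the paper):
#   p(x)    = x^2 - 2        univariate in the smallest variable x
#   q(x, y) = y - (x + 1)    linear in y once x is fixed
p = [1, 0, -2]
g = [1, 1]

tol = Fraction(1, 10**6)
x_lo, x_hi = isolate_root(p, 1, 2, tol)

# Back substitution: g is increasing, so the endpoints of the x-interval
# map to the endpoints of the y-interval.
y_lo, y_hi = evaluate(g, x_lo), evaluate(g, x_hi)
```

Here the root of p is irrational, so `x_lo` and `x_hi` are rational bounds on it rather than its exact value, and the bounds on y inherit that approximation. Tightening `tol` narrows both intervals at the cost of more bisection steps.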