Sparse grids, as studied by Zenger and Griebel in the last 10 years have been very successful in the solution of partial differential equations, integral equations and classification problems. Adaptive sparse grid functions are elements of a function space lattice. Such lattices allow the generalisation of sparse grid techniques to the fitting of very high-dimensional functions with categorical and continuous variables. We have observed in first tests that these general adaptive sparse grids allow the identification of the anova structure and thus provide comprehensible models. This is very important for data mining applications. Perhaps the main advantage of these models is that they do not include any spurious interaction terms and thus can deal with very high dimensional data.
Abstract. Finite difference methods, such as the mid-point rule, have been applied successfully to the numerical solution of ordinary and partial differential equations. If such formulas are applied to observational data, in order to determine derivatives, the results can be disastrous. The reason for this is that measurement errors, and even rounding errors in computer approximations, are strongly amplified in the differentiation process, especially if small step-sizes are chosen and higher derivatives are required.A number of authors have examined the use of various forms of averaging which allows the stable computation of low order derivatives from observational data. The size of the averaging set acts like a regularization parameter and has to be chosen as a function of the grid size h.In this paper, it is initially shown how first (and higher) order single-variate numerical differentiation of higher dimensional observational data can be stabilized with a reduced loss of accuracy than occurs for the corresponding differentiation of one-dimensional data. The result is then extended to the multivariate differentiation of higher dimensional data. The nature of the trade-off between convergence and stability is explicitly characterized, and the complexity of various implementations is examined.
Association rules are "if-then rules" with two measures which quantify the support and confidence of the rule for a given data set. Having their origin in market basked analysis, association rules are now one of the most popular tools in data mining. This popularity is to a large part due to the availability of efficient algorithms. The first and arguably most influential algorithm for efficient association rule discovery is Apriori.In the following we will review basic concepts of association rule discovery including support, confidence, the apriori property, constraints and parallel algorithms. The core consists of a review of the most important algorithms for association rule discovery. Some familiarity with concepts like predicates, probability, expectation and random variables is assumed.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.