Abstract-The Neyman-Pearson (NP) approach to hypothesis testing is useful in situations where different types of error have different consequences or a priori probabilities are unknown. For any α ∈ (0, 1), the NP lemma specifies the most powerful test of size α, but assumes the distributions for each hypothesis are known or (in some cases) that the likelihood ratio is monotonic in an unknown parameter. This paper investigates an extension of NP theory to situations in which one has no knowledge of the underlying distributions except for a collection of independent and identically distributed (i.i.d.) training examples from each hypothesis. Building on a "fundamental lemma" of Cannon et al., we demonstrate that several concepts from statistical learning theory have counterparts in the NP context. Specifically, we consider constrained versions of empirical risk minimization (NP-ERM) and structural risk minimization (NP-SRM), and prove performance guarantees for both. General conditions are given under which NP-SRM leads to strong universal consistency. We also apply NP-SRM to (dyadic) decision trees to derive rates of convergence. Finally, we present explicit algorithms to implement NP-SRM for histograms and dyadic decision trees.

Index Terms-Generalization error bounds, Neyman-Pearson (NP) classification, statistical learning theory.
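To make the NP-ERM idea concrete, here is a minimal sketch, not the paper's algorithm: over a finite set of candidate classifiers, it minimizes the empirical miss rate (error on class-1 data) subject to a relaxed empirical false-alarm constraint. The function names, the scalar `tol` slack, and the threshold-classifier example are our illustrative assumptions; in the paper the tolerance is tied to the complexity of the class.

```python
import numpy as np

def np_erm(classifiers, X0, X1, alpha, tol=0.0):
    """Neyman-Pearson empirical risk minimization (illustrative sketch).

    classifiers : list of functions mapping samples to 0/1 labels
    X0 : i.i.d. training samples from hypothesis 0 (null)
    X1 : i.i.d. training samples from hypothesis 1 (alternative)
    alpha : target size (false-alarm) constraint
    tol : slack relaxing the empirical constraint (stand-in for the
          complexity-dependent tolerance in the NP-ERM guarantees)
    """
    best, best_miss = None, np.inf
    for h in classifiers:
        false_alarm = np.mean(h(X0) == 1)  # empirical size on class-0 data
        miss = np.mean(h(X1) == 0)         # empirical miss on class-1 data
        if false_alarm <= alpha + tol and miss < best_miss:
            best, best_miss = h, miss
    return best

# Example usage: candidate classifiers are thresholds on a 1-D feature.
thresholds = np.linspace(-3, 3, 61)
classifiers = [lambda x, t=t: (x > t).astype(int) for t in thresholds]
rng = np.random.default_rng(0)
h = np_erm(classifiers, rng.normal(0, 1, 500), rng.normal(1, 1, 500), alpha=0.1)
```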
Decision trees are among the most popular types of classifiers, with interpretability and ease of implementation being among their chief attributes. Despite the widespread use of decision trees, theoretical analysis of their performance has only begun to emerge in recent years. In this paper it is shown that a new family of decision trees, dyadic decision trees (DDTs), attains nearly optimal (in a minimax sense) rates of convergence for a broad range of classification problems. Furthermore, DDTs are surprisingly adaptive in three important respects: they automatically (1) adapt to favorable conditions near the Bayes decision boundary; (2) focus on data distributed on lower dimensional manifolds; and (3) reject irrelevant features. DDTs are constructed by penalized empirical risk minimization using a new data-dependent penalty and may be computed exactly with computational complexity that is nearly linear in the training sample size. DDTs are the first classifier known to achieve nearly optimal rates for the diverse class of distributions studied here while also being practical and implementable. This is also the first study (of which we are aware) to consider rates for adaptation to intrinsic data dimension and relevant features.
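The paper's exact, nearly linear-time algorithm is a dynamic program over dyadic cells; purely as an illustration of the construction, the sketch below fits a dyadic tree by exhaustive midpoint splitting with a flat per-leaf penalty. The recursion, the depth cap, and the scalar `penalty` are our assumptions and stand in for the paper's data-dependent penalty.

```python
import numpy as np

def grow_ddt(X, y, cell, depth, max_depth, penalty):
    """Sketch of penalized dyadic tree fitting (not the paper's algorithm).

    X : (n, d) features scaled to [0, 1)^d; y : 0/1 labels
    cell : list of (lo, hi) intervals defining the current hyperrectangle
    Returns (tree, penalized empirical risk); a tree is either a leaf label
    or a tuple (split coordinate, midpoint, left subtree, right subtree).
    """
    inside = np.all([(X[:, j] >= lo) & (X[:, j] < hi)
                     for j, (lo, hi) in enumerate(cell)], axis=0)
    yc = y[inside]
    label = int(yc.mean() >= 0.5) if yc.size else 0
    leaf_cost = np.sum(yc != label) + penalty   # errors plus one leaf's penalty
    if depth == max_depth:
        return label, leaf_cost
    best = (label, leaf_cost)                   # option: stop with a single leaf
    for j, (lo, hi) in enumerate(cell):         # try each coordinate's midpoint
        mid = (lo + hi) / 2
        lt, lc = grow_ddt(X, y, cell[:j] + [(lo, mid)] + cell[j+1:],
                          depth + 1, max_depth, penalty)
        rt, rc = grow_ddt(X, y, cell[:j] + [(mid, hi)] + cell[j+1:],
                          depth + 1, max_depth, penalty)
        if lc + rc < best[1]:
            best = ((j, mid, lt, rt), lc + rc)
    return best
```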
Abstract-A common approach to determining corresponding points on two shapes is to compute the cost of each possible pairing of points and solve the assignment problem (weighted bipartite matching) for the resulting cost matrix. We consider the problem of solving for point correspondences when the shapes of interest are each defined by a single, closed contour. A modification of the standard assignment problem is proposed whereby the correspondences are required to preserve the ordering of the points induced from the shapes' contours. Enforcement of this constraint leads to significantly improved correspondences. Robustness with respect to outliers and shape irregularity is obtained by requiring only a fraction of feature points to be matched. Furthermore, the minimum matching size may be specified in advance. We present efficient dynamic programming algorithms to solve the proposed optimization problem. Experiments on the Brown and MPEG-7 shape databases demonstrate the effectiveness of the proposed method relative to the standard assignment problem.

Index Terms-Assignment problem, contour matching, dynamic programming, MPEG-7 shape database, shape descriptors.
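As a hedged sketch of the dynamic-programming core, the following handles only the linear (non-cyclic) order-preserving case with a per-point skip penalty; the paper's full method additionally handles the cyclic ordering of closed contours and a minimum matching size. The `skip_penalty` parameter is our illustrative knob for leaving points unmatched.

```python
import numpy as np

def order_preserving_match(C, skip_penalty):
    """Order-preserving assignment of two point sequences (sketch).

    C : (m, n) cost matrix; C[i, j] is the cost of matching point i on
        shape 1 to point j on shape 2, with both sequences in contour order.
    Returns the minimum total cost; the correspondences themselves can be
    recovered by backtracking through the table D.
    """
    m, n = C.shape
    D = np.full((m + 1, n + 1), np.inf)
    D[0, :] = skip_penalty * np.arange(n + 1)   # skip leading points of shape 2
    D[:, 0] = skip_penalty * np.arange(m + 1)   # skip leading points of shape 1
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i, j] = min(D[i-1, j-1] + C[i-1, j-1],  # match i with j
                          D[i-1, j] + skip_penalty,    # leave i unmatched
                          D[i, j-1] + skip_penalty)    # leave j unmatched
    return D[m, n]
```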
Surrogate losses underlie numerous state-of-the-art binary classification algorithms, such as support vector machines and boosting. The impact of a surrogate loss on the statistical performance of an algorithm is well-understood in symmetric classification settings, where the misclassification costs are equal and the loss is a margin loss. In particular, classification-calibrated losses are known to imply desirable properties such as consistency. While numerous efforts have been made to extend surrogate loss-based algorithms to asymmetric settings, to deal with unequal misclassification costs or training data imbalance, considerably less attention has been paid to whether the modified loss is still calibrated in some sense. This article extends the theory of classification-calibrated losses to asymmetric problems. As in the symmetric case, it is shown that calibrated asymmetric surrogate losses give rise to excess risk bounds, which control the expected misclassification cost in terms of the excess surrogate risk. This theory is illustrated on the class of uneven margin losses, and the uneven hinge, squared error, exponential, and sigmoid losses are treated in detail.
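To illustrate the uneven-margin template discussed above, here is a minimal sketch of the uneven hinge loss; the parameter names `beta` and `gamma` are ours, and this is an illustration of the general family rather than the article's full treatment.

```python
import numpy as np

def uneven_hinge(y, f, beta=1.0, gamma=1.0):
    """Uneven hinge loss (sketch of the uneven-margin family).

    For positives (y = +1): max(0, 1 - f)
    For negatives (y = -1): beta * max(0, 1 + gamma * f)
    beta reweights errors on negatives; gamma rescales their margin.
    With beta = gamma = 1 this reduces to the symmetric hinge loss.
    """
    pos = np.maximum(0.0, 1.0 - f)
    neg = beta * np.maximum(0.0, 1.0 + gamma * f)
    return np.where(y == 1, pos, neg)
```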
Stored implantable cardioverter-defibrillator (ICD) electrograms (EGMs) usually are an accurate surrogate for 12-lead ECGs for differentiating clinical ventricular tachycardias (VTs) from other VTs. Pace mapping based on ICD EGMs has variable resolution but may be useful for identifying a VT exit site.