In this paper, we discuss distributed optimization techniques for configuring classifiers in a real-time, informationally distributed stream mining system. Because of the large volume of streaming data, stream mining systems must often cope with overload, which can lead to poor performance and intolerable processing delay for real-time applications. Furthermore, optimizing over an entire system of classifiers is difficult, because changing the filtering process at one classifier affects the feature values of data arriving at classifiers further downstream and, in turn, both the classification performance achieved by the ensemble of classifiers and the end-to-end processing delay. To address this problem, this paper makes three main contributions: 1) Based on classification and queuing-theoretic models, we propose a utility metric that captures both the performance and the delay of a binary filtering classifier system. 2) We introduce a low-complexity framework for estimating the system utility by observing, estimating, and/or exchanging parameters between the inter-related classifiers deployed across the system. 3) We provide distributed algorithms to reconfigure the system, and analyze the algorithms based on their convergence properties, optimality, information exchange overhead, and rate of adaptation to non-stationary data sources. We present results for several video classifier systems.
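To make the idea of a utility that trades classification performance against delay more concrete, the following is a minimal sketch and not the metric defined in the paper. It assumes a simple chain of binary filtering classifiers, models each classifier as an M/M/1 queue, and combines goodput, false alarms, and mean delay linearly. The names `Stage` and `chain_utility`, the parameters `phi`, `false_alarm_cost`, and `delay_weight`, and the form of the utility are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Stage:
    """One binary filtering classifier in a chain (hypothetical model)."""
    phi: float       # assumed fraction of upstream-relevant items that are relevant here
    p_detect: float  # probability of forwarding a relevant item
    p_false: float   # probability of forwarding an irrelevant item
    mu: float        # service rate in items per second


def chain_utility(stages, input_rate, false_alarm_cost=1.0, delay_weight=0.1):
    """Illustrative utility: goodput minus weighted false alarms and delay.

    Assumption: each stage behaves like an M/M/1 queue, so its mean
    sojourn time is 1 / (mu - arrival_rate) while arrival_rate < mu.
    """
    throughput = input_rate  # rate of items forwarded so far
    goodput = input_rate     # rate of correctly forwarded items so far
    total_delay = 0.0
    for s in stages:
        if throughput >= s.mu:
            # Overloaded stage: the queue grows without bound.
            return -math.inf
        total_delay += 1.0 / (s.mu - throughput)
        new_goodput = goodput * s.phi * s.p_detect
        # Everything that is not relevant at this stage passes with p_false.
        irrelevant_rate = throughput - goodput * s.phi
        throughput = new_goodput + irrelevant_rate * s.p_false
        goodput = new_goodput
    false_alarms = throughput - goodput
    return goodput - false_alarm_cost * false_alarms - delay_weight * total_delay


if __name__ == "__main__":
    chain = [Stage(0.5, 0.9, 0.1, 20.0), Stage(0.5, 0.8, 0.2, 10.0)]
    print(chain_utility(chain, input_rate=10.0))
```

Lowering `p_false` at an upstream stage reduces the arrival rate seen downstream, which shortens downstream delay but typically lowers `p_detect` as well. This is the coupling between classifiers that makes joint configuration hard.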
Real-time multimedia semantic concept detection requires instant identification of a set of concepts in streaming video or images. However, the potentially high data volumes of multimedia content and the high complexity of individual concept detectors have hindered its practical deployment. In this paper, we present a new online concept detection system deployed on top of a distributed stream mining system. It uses a tree topology of classifiers constructed on a semantic hierarchy of the concepts of interest. We introduce a novel methodology for configuring such cascaded classifier topologies under constraints on the available resources. In our approach, we configure individual classifiers with optimized operating points after jointly and explicitly considering the misclassification cost of each end-to-end class of interest in the tree, the system-imposed resource constraints, and the confidence level of each object that is classified. We describe the implemented application, system, and optimization algorithms, and verify that our approach can significantly improve classification accuracy.
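As a rough illustration of choosing operating points under a resource constraint, the sketch below exhaustively searches a small set of candidate operating points for a chain of classifiers. It is not the optimization algorithm described in the paper: it uses a chain rather than a tree, ignores per-object confidence levels, and relies on brute-force enumeration. The function `best_operating_points` and all of its parameters (`phi`, `cost_per_item`, `budget`, `miss_cost`, `false_alarm_cost`) are hypothetical.

```python
from itertools import product


def best_operating_points(candidates, phi, input_rate, cost_per_item, budget,
                          miss_cost=1.0, false_alarm_cost=1.0):
    """Brute-force search over operating points for a classifier chain.

    candidates[i] lists (p_detect, p_false) pairs for classifier i.
    phi[i] is the assumed fraction of upstream-relevant items that are
    also relevant at classifier i. cost_per_item[i] is the assumed
    processing cost per arriving item. Returns (cost, combo), or None
    when no combination fits within the budget.
    """
    best = None
    for combo in product(*candidates):
        throughput = goodput = input_rate
        relevant = input_rate  # rate of truly relevant items, forwarded or not
        load = 0.0
        for (p_d, p_f), ph, c in zip(combo, phi, cost_per_item):
            load += c * throughput          # resources consumed at this stage
            relevant *= ph
            new_goodput = goodput * ph * p_d
            throughput = new_goodput + (throughput - goodput * ph) * p_f
            goodput = new_goodput
        if load > budget:
            continue                        # violates the resource constraint
        misses = relevant - goodput
        false_alarms = throughput - goodput
        cost = miss_cost * misses + false_alarm_cost * false_alarms
        if best is None or cost < best[0]:
            best = (cost, combo)
    return best


if __name__ == "__main__":
    points = [(0.9, 0.3), (0.7, 0.1)]
    print(best_operating_points([points, points], [0.5, 0.5], 10.0,
                                [1.0, 1.0], budget=15.0))
```

Tightening `budget` forces the upstream classifier toward a more selective operating point, since every item it forwards consumes resources downstream. Exhaustive search grows exponentially with the number of classifiers, so it is only practical for small topologies.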