Abstract. In this paper, we propose a novel online classifier for complex data streams generated from non-stationary stochastic properties. Instead of using a single training model and counters to keep important data statistics, the introduced online classifier scheme provides a real-time self-adjusting learning model. The learning model applies the multiplication-based update algorithm of the Stochastic Learning Weak Estimator (SLWE) at each time instant as a new labeled instance arrives. In this way, the data statistics are updated every time a new element is inserted, without having to rebuild the model when the data distributions change. Finally, and most importantly, the model operates with the understanding that the correct classes of previously classified patterns become available some time instances later, which requires us to update the training set and the training model. The results obtained from a rigorous empirical analysis on multinomial distributions are remarkable. They demonstrate the applicability of our method on synthetic datasets and the advantages of the introduced scheme.
Classification typically deals with unique and distinct training and testing phases. This paper pioneers the case in which these phases are not so clearly defined. More specifically, we consider the case where the testing patterns can subsequently be considered as training patterns. The paradigm is further complicated because we assume that the class-conditional distributions of the features/classes are non-stationary, as in most real-world applications. Specifically, we consider the model where the training phase is non-stationary, is interleaved with the testing, and can be done online and in real time. We propose a novel online classifier for complex data streams generated from non-stationary stochastic properties. Instead of using a single training model with "counters" that maintain important data statistics, our online classifier scheme provides a real-time self-adjusting learning model. The learning model applies the multiplication-based update algorithm of the Stochastic Learning Weak Estimator (SLWE) at each time instant as a new labeled instance arrives. In this way, the data statistics are updated every time a new element is seen, without having to rebuild the model when the data distributions change. Finally, and most importantly, the model operates with the understanding that the correct classes of previously classified patterns become available some time instances later. This forces us to update the training set, the training model and the class-conditional distributions as the testing proceeds. The results from a rigorous empirical analysis on two-dimensional/multi-dimensional and binomial/multinomial distributions are remarkable. We also report some results on two real-life datasets adapted to this model of computation, demonstrating the advantages of the novel scheme for both binomial and multinomial non-stationary distributions.
Classification is a well-known problem in Pattern Recognition that has been studied extensively for decades. The classification process involves assigning a class label to an unlabeled element based on an available training sample. A common assumption in the majority of existing classification algorithms is that the stochastic distribution of the data being classified is stationary and does not change with time. However, in some real-world domains the data distribution can be non-stationary: either the distribution or characterizing aspects of the features change over time, or the data generation phenomenon itself changes over time, which in turn leads to a variation in the data distribution. In this thesis, we consider the problem of C-class classification and of detecting the source of data in periodic non-stationary environments. Within our model, sequential patterns arrive and are processed in the form of a data stream that was generated from different sources with distinct statistical distributions. Using a family of Stochastic-Learning based Weak Estimators, we adopt a scheme to estimate the probability vector of the binomial/multinomial distribution underlying the data. We also use the multiplication-based update algorithm to provide a self-adjusting learning scheme that adapts the model to any abrupt changes occurring in the environment. We consider two different classification scenarios. First, we study a scenario in which the stream of data was generated from more than two sources, each with its own fixed stochastic properties. We then propose a novel online classifier for more complex data streams generated from non-stationary stochastic properties. An empirical analysis on synthetic datasets demonstrates the advantages of the introduced scheme for both binomial and multinomial non-stationary distributions.
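The abstracts above refer to a multiplication-based SLWE update without stating it. As a rough illustration only, the following Python sketch uses the multiplicative rule commonly given for the multinomial SLWE: every component of the estimate is scaled by a learning parameter, and the freed probability mass is assigned to the symbol just observed. The function name `slwe_update`, the parameter name `lam`, and the value 0.9 are assumptions made for this sketch, not details taken from the abstracts.

```python
from typing import List


def slwe_update(p: List[float], observed: int, lam: float = 0.9) -> List[float]:
    """Return the estimate after one multiplicative SLWE-style step.

    p        -- current estimate of the multinomial probability vector
    observed -- index of the symbol that just arrived
    lam      -- learning parameter in (0, 1); 0.9 is an illustrative value
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    if not 0 <= observed < len(p):
        raise IndexError("observed symbol index out of range")
    # Shrink every component multiplicatively ...
    updated = [lam * p_i for p_i in p]
    # ... and give the released mass (1 - lam) to the observed symbol,
    # so the vector still sums to 1 if the input did.
    updated[observed] += 1.0 - lam
    return updated


def track(stream: List[int], num_symbols: int, lam: float = 0.9) -> List[float]:
    """Run the update over a whole stream, starting from a uniform estimate."""
    estimate = [1.0 / num_symbols] * num_symbols
    for symbol in stream:
        estimate = slwe_update(estimate, symbol, lam)
    return estimate


if __name__ == "__main__":
    # Hypothetical stream with an abrupt change: symbol 0 only, then symbol 1 only.
    before_switch = track([0] * 200, num_symbols=2)
    after_switch = track([0] * 200 + [1] * 200, num_symbols=2)
    print(before_switch, after_switch)
```

Because old observations are discounted geometrically, the estimate follows an abrupt change in the source without any explicit reset or counter, which is the property the abstracts rely on. A smaller `lam` adapts faster but gives a noisier estimate.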