Abstract: Discretization, as a preprocessing step for data mining, converts the continuous attributes of a data set into discrete ones so that machine learning algorithms can treat them as nominal features. Discretization methods that use entropy-based criteria form a large class of algorithms. However, entropy does not always accurately reflect the degree of class homogeneity of an interval. Therefore, in this paper, we propose a new measure of the class heterogeneity of intervals, defined from the viewpoint of class probability itself. Based on this definition of heterogeneity, we present a new criterion for evaluating a discretization scheme and analyze its properties theoretically. We also propose a heuristic method for finding an approximately optimal discretization scheme. Finally, we compare our method, in terms of predictive error rate and tree size, with Ent-MDLC, a representative entropy-based discretization method known for its good performance. Our method produces better results than Ent-MDLC, although the improvement is not significant. It can be a good alternative to entropy-based discretization methods.
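For readers unfamiliar with the entropy criterion the abstract refers to, the following is a minimal sketch of how the class entropy of an interval, and of a candidate cut point, is typically computed. It illustrates only the standard Shannon entropy baseline, not the heterogeneity measure proposed in the paper, which the abstract does not define. The function names `class_entropy` and `split_entropy` are hypothetical.

```python
from collections import Counter
from math import log2


def class_entropy(labels):
    """Shannon entropy (in bits) of the class labels that fall in one interval."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum of p * log2(1/p) over the classes present in the interval.
    return sum((c / total) * log2(total / c) for c in counts.values())


def split_entropy(values, labels, cut):
    """Size-weighted class entropy of the two intervals created by `cut`.

    Assumption for illustration: values <= cut go left, values > cut go right.
    """
    left = [y for x, y in zip(values, labels) if x <= cut]
    right = [y for x, y in zip(values, labels) if x > cut]
    n = len(labels)
    return (len(left) / n) * class_entropy(left) + (len(right) / n) * class_entropy(right)
```

An entropy-based discretizer would evaluate `split_entropy` at each candidate cut and prefer cuts that lower it; a pure interval scores 0 and an evenly mixed two-class interval scores 1 bit.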
Abstract: Flooding is one of the most fundamental operations in mobile ad hoc networks. Traditional implementations of flooding suffer from excessive message redundancy, resource contention, and signal collision, which cause high protocol overhead and interfere with existing traffic in the network. Several efficient flooding algorithms have been proposed to avoid these problems. However, these algorithms either perform poorly in reducing redundant transmissions or require each node to maintain information about its 2-hop (or more) neighbors. In this paper, we study the necessary and sufficient condition for 100% deliverability in flooding schemes that are based only on 1-hop neighbor information. We further propose an efficient flooding algorithm that achieves local optimality in two senses: 1) the number of forwarding nodes in each step is minimal; 2) the time complexity of computing the forwarding nodes is the lowest, O(n log n), where n is the number of neighbors of a node. Extensive simulations show that our algorithm performs significantly better than existing message-efficient flooding methods.
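To make the redundancy problem concrete, here is a minimal sketch of traditional (blind) flooding on a static graph, in which every node rebroadcasts the message once on first receipt. It is an idealized illustration only: it ignores collisions, contention, and mobility, and it is not the 1-hop algorithm proposed in the paper. The function name `blind_flood` and the adjacency-dictionary representation are assumptions made for this example.

```python
from collections import deque


def blind_flood(adjacency, source):
    """Simulate blind flooding from `source`.

    `adjacency` maps each node to a list of its 1-hop neighbors.
    Returns (transmissions, redundant_receptions, reached_nodes).
    """
    reached = {source}
    queue = deque([source])
    transmissions = 0
    redundant = 0
    while queue:
        node = queue.popleft()
        transmissions += 1  # the node broadcasts once to all of its neighbors
        for neighbor in adjacency[node]:
            if neighbor in reached:
                redundant += 1  # neighbor already holds the message
            else:
                reached.add(neighbor)
                queue.append(neighbor)
    return transmissions, redundant, reached


# Three mutually connected nodes: every node transmits, most receptions are wasted.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(blind_flood(triangle, 0))
```

In a connected graph this scheme always uses one transmission per node, which is the overhead that forwarding-node selection schemes aim to reduce.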