Mining data streams, such as Internet traffic and network security data, is complex. Because storing the full stream is impractical, data stream analytics must be done in a single scan. This limits the time available to observe stream features and hence further complicates the data mining process. Traditional supervised data mining, with its batch-training nature, is not suitable for mining data streams. This paper proposes an algorithm for online data stream classification and learning with limited labels using selective self-training semi-supervised classification. The experimental results show that it achieves up to 99.6% average accuracy with 10% labeled data and 98.6% average accuracy with 1% labeled data. It can classify up to 34K instances per second.
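The abstract does not describe the algorithm's internals, so the following is only a minimal sketch of the general idea of selective self-training in a single pass, not the authors' method. It assumes a simple incremental nearest-centroid model and a distance-ratio confidence score; the names `OnlineCentroidClassifier`, `process_stream`, and `threshold`, as well as the toy feature vectors and labels, are hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict


class OnlineCentroidClassifier:
    """Incremental nearest-centroid model (an assumed stand-in classifier)."""

    def __init__(self):
        self.sums = {}                  # class label -> per-feature running sum
        self.counts = defaultdict(int)  # class label -> instances absorbed

    def update(self, x, label):
        """Fold one instance into the running mean of its class."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(x)
        for i, value in enumerate(x):
            self.sums[label][i] += value
        self.counts[label] += 1

    def predict(self, x):
        """Return (label, confidence), with confidence in [0, 1]."""
        if not self.sums:
            return None, 0.0
        dists = []
        for label, total in self.sums.items():
            n = self.counts[label]
            centroid = [v / n for v in total]
            dists.append((math.dist(x, centroid), label))
        dists.sort(key=lambda pair: pair[0])
        best_dist, best_label = dists[0]
        if len(dists) == 1 or dists[1][0] == 0.0:
            return best_label, 0.0      # no competing class to compare against
        # Assumed confidence: how much closer the best centroid is than the runner-up.
        return best_label, 1.0 - best_dist / dists[1][0]


def process_stream(stream, model, threshold=0.5):
    """Single scan over (features, label_or_None) pairs; returns the predictions."""
    predictions = []
    for x, label in stream:
        predicted, confidence = model.predict(x)
        predictions.append(predicted)
        if label is not None:
            model.update(x, label)      # labeled instance: supervised update
        elif predicted is not None and confidence >= threshold:
            model.update(x, predicted)  # selective self-training on a confident guess
    return predictions


# Toy stream: two labeled instances followed by three unlabeled ones.
stream = [
    ([0.0, 0.0], "normal"),
    ([10.0, 10.0], "attack"),
    ([0.5, 0.2], None),
    ([9.5, 10.1], None),
    ([5.0, 5.0], None),   # ambiguous: roughly midway between both centroids
]
model = OnlineCentroidClassifier()
preds = process_stream(stream, model)
```

In this toy run the two unlabeled instances near a centroid are used for self-training, while the ambiguous midpoint is classified but not learned from. That is the "selective" part: low-confidence predictions are kept out of the model so that early mistakes are less likely to reinforce themselves.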
Peer-to-Peer (P2P) applications are bandwidth-heavy and lead to network congestion. The masquerading nature of P2P traffic renders conventional identification methods ineffective. To manage and control P2P traffic efficiently, preferably within the network, such traffic must be identified online and accurately. This paper proposes a technique for online P2P identification based on traffic event signatures. The experimental results show that it identifies P2P traffic on the fly with an accuracy of 97.7%, a precision of 98% and a recall of 99.2%.