Machine learning (ML) algorithms have been shown to be effective in classifying today's dynamic Internet traffic. Using additional features and more sophisticated ML techniques can improve accuracy and allow a broader range of application classes to be classified. Realizing such classifiers at high data rates is challenging. In this paper, we propose two architectures that realize a complete online traffic classifier using flow-level features. First, we develop a traffic classifier based on the C4.5 decision tree algorithm and the Entropy-MDL discretization algorithm. It achieves an accuracy of 97.92% when classifying a traffic trace consisting of eight application classes. Next, we accelerate our classifier using two architectures on FPGA. One architecture stores the classifier in on-chip distributed RAM and is designed to sustain a high throughput. The other stores the classifier in block RAM and is designed to operate with a small hardware footprint, and thus at low hardware cost. Experimental results show that our high-throughput architecture can sustain a throughput of 550 Gbps assuming a 40-byte packet size. Our low-cost architecture demonstrates 22% better resource efficiency than the high-throughput design. It can be easily replicated to achieve 449 Gbps while supporting 160 input traffic streams concurrently. Both architectures are parameterizable and programmable to support any binary-tree-based traffic classifier. We develop a tool that allows users to easily map a binary-tree-based classifier to hardware. The tool takes a classifier as input and automatically generates the Verilog code for the corresponding hardware architecture.
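To illustrate what mapping a binary-tree-based classifier to a memory table can look like, the following is a minimal software sketch and not the paper's hardware design. The node layout, the feature bins, the class labels, and the names `TREE` and `classify` are all assumptions made for the example. Each table entry is either an internal node that compares one discretized flow feature against a threshold, or a leaf holding a class label.

```python
from typing import List, Sequence, Tuple, Union

# Internal node: (feature_index, threshold, left_child, right_child).
# Leaf: a class label string.
# This layout and the labels below are hypothetical, chosen only to show
# how a binary tree can be flattened into an indexable table.
Node = Union[Tuple[int, int, int, int], str]

TREE: List[Node] = [
    (0, 2, 1, 2),  # entry 0: go left if feature 0 (a discretized bin) <= 2
    "dns",         # entry 1: leaf
    (1, 1, 3, 4),  # entry 2: go left if feature 1 (a discretized bin) <= 1
    "http",        # entry 3: leaf
    "p2p",         # entry 4: leaf
]


def classify(tree: Sequence[Node], features: Sequence[int]) -> str:
    """Walk the table from the root (entry 0) until a leaf is reached."""
    index = 0
    while not isinstance(tree[index], str):
        feature_index, threshold, left, right = tree[index]
        # One table lookup and one comparison per tree level.
        index = left if features[feature_index] <= threshold else right
    return tree[index]
```

For example, `classify(TREE, [3, 1])` follows the right branch at the root and the left branch at entry 2. Because each level needs only one lookup and one comparison, a table of this kind lends itself to a pipelined implementation, although the sketch says nothing about how the paper's architectures are actually organized.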
In the context of networking, a heavy hitter is an entity in a data stream whose amount of activity (such as bandwidth consumption or number of connections) is higher than a given threshold. Detecting heavy hitters is a critical task for network management and security in the Internet and in data centers. Data streams in modern networks usually contain millions of entities, such as traffic flows or IP domains. It is challenging to detect heavy hitters at a high throughput while supporting such a large number of entities. In this work, we propose a high-throughput online heavy hitter detector on FPGA based on the Count-min sketch algorithm. We propose a high-throughput hash computation architecture, optimize the Count-min sketch for hardware-based heavy hitter detection, and use forwarding to deal with data hazards. The post-place-and-route results of our architecture on a state-of-the-art FPGA show high throughput and scalability. Our architecture achieves a throughput of 114 Gbps while supporting a typical 1 M concurrent entities. It sustains 100+ Gbps throughput while supporting various numbers of concurrent entities, stream sizes, and accuracy requirements. Our implementation demonstrates improved performance compared with other sketch acceleration techniques on various platforms using similar sketch configurations.
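The idea of threshold-based heavy hitter detection on top of a Count-min sketch can be shown with a short software sketch. This is an illustration under stated assumptions, not the paper's FPGA architecture: it does not model the hardware hash unit, pipelining, or forwarding for data hazards. The class name `CountMinSketch`, the function `heavy_hitters`, the default width and depth, and the use of salted BLAKE2b as the per-row hash are all choices made for the example.

```python
import hashlib
from typing import Iterable, List, Set, Tuple


class CountMinSketch:
    """A depth x width array of counters, one hash function per row."""

    def __init__(self, width: int, depth: int) -> None:
        self.width = width
        self.depth = depth
        self.rows: List[List[int]] = [[0] * width for _ in range(depth)]

    def _index(self, row: int, key: str) -> int:
        # Assumption: salted BLAKE2b stands in for the per-row hash functions.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, key: str, amount: int = 1) -> int:
        """Add `amount` for `key` and return the new estimate (row minimum)."""
        counts = []
        for row in range(self.depth):
            i = self._index(row, key)
            self.rows[row][i] += amount
            counts.append(self.rows[row][i])
        return min(counts)

    def estimate(self, key: str) -> int:
        """Return an estimate that never falls below the true total."""
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))


def heavy_hitters(
    stream: Iterable[Tuple[str, int]],
    threshold: int,
    width: int = 1024,
    depth: int = 4,
) -> Set[str]:
    """Report keys whose estimated activity reaches `threshold`."""
    sketch = CountMinSketch(width, depth)
    found: Set[str] = set()
    for key, amount in stream:
        if sketch.update(key, amount) >= threshold:
            found.add(key)
    return found
```

Because hash collisions can only inflate counters, the estimate is never lower than the true total, so a genuine heavy hitter is never missed. The cost is occasional false positives, which become less likely as the width and depth grow.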