Section: Support Vector Machines (SVMs)
“…In [70,71], the authors divide Internet traffic applications into broad classes. For example, the multimedia class contains all streaming applications, while the bulk class contains file-transfer applications such as FTP. Li et al. [70] then use a multi-class SVM to classify incoming flows into one of these classes, while Liu et al. [71] convert binary classifiers into an online multi-class classifier but do not specify how. Li et al. [70] achieve over 99% flow accuracy on the campus trace they collected (or a flow accuracy of 96% when they biased the classifier to have an equal mix of all the applications), while the classifier used by Liu et al. [71] achieves a flow accuracy of approximately 80% on the Auckland IV trace.…”
Classifying Internet traffic flows online into applications or broader classes, without inspecting packet payloads or relying on port numbers, has become a necessity for network operators. Operators can use this information to monitor their networks and provide per-class quality of service. A great deal of research has recently been done on Internet traffic classification, and numerous techniques have been proposed. While current techniques can classify Internet traffic with high accuracy, providing performance guarantees for particular classes of interest has never been addressed. In this thesis, we provide two novel types of online Internet traffic classifiers that offer performance guarantees on the false alarm rate and the false discovery rate, respectively. These guarantees can apply to an entire class (class-wise) or between two classes (pair-wise). Controlling false alarm rates is well suited to application prioritization (e.g., prioritizing time-sensitive applications like VoIP over HTTP), whereas controlling false discovery rates is better suited to blocking or rate-limiting a targeted class of traffic (e.g., Peer-to-Peer). The classifier that provides false alarm rate guarantees is based on a Neyman-Pearson classification framework, while the classifier that provides false discovery rate guarantees is based on the Learning to Satisfy (LSAT) framework. Both classifiers are implemented using a machine learning technique, namely the 2-nu Support Vector Machine (SVM). All previous work with these two statistical methodologies focused on binary classification only; we extend them to a multi-class setting. In addition to the regular application classification problem, we present preliminary work on a binary LSAT classifier that can detect, after receiving only a handful of packets, whether a flow will be large, as defined by a network operator. This large flow detector can act as a preprocessor for regular application classifiers: passing only large flows to the classifier lets it focus on the more resource-intensive flows. We validated our Internet traffic classifiers by testing our approaches on data provided by an ISP.
Abrégé (French abstract, translated): Identifying the application (or another broader class) that generates an Internet traffic flow, without relying on the port number or inspecting the packet payload, has become a necessity for network operators. Operators can use this information to monitor their networks and provide class-specific quality of service. A great deal of research on Internet traffic classification has been carried out recently, and numerous techniques have been proposed. Although current techniques can achieve high accuracy in classifying Internet traffic, offering performance guarantees for particular categories is a problem that remains unexplored. In this thesis, we propose ...
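The difference between the two error rates named in the abstract can be made concrete with a small sketch. The code below is illustrative only and is not the thesis's 2-nu SVM: it computes the class-wise false alarm rate and false discovery rate from labeled predictions, and picks a score threshold that keeps the empirical false alarm rate on held-out non-target flows at or below a chosen level, in the spirit of a Neyman-Pearson constraint. All function names, the class labels, and the sample scores are assumptions made for illustration.

```python
from typing import Sequence


def false_alarm_rate(y_true: Sequence[str], y_pred: Sequence[str], target: str) -> float:
    """Fraction of flows that are NOT `target` but were labeled `target` (FP / (FP + TN))."""
    negatives = [p for t, p in zip(y_true, y_pred) if t != target]
    if not negatives:
        return 0.0
    return sum(p == target for p in negatives) / len(negatives)


def false_discovery_rate(y_true: Sequence[str], y_pred: Sequence[str], target: str) -> float:
    """Fraction of flows labeled `target` that do not belong to it (FP / (FP + TP))."""
    flagged = [t for t, p in zip(y_true, y_pred) if p == target]
    if not flagged:
        return 0.0
    return sum(t != target for t in flagged) / len(flagged)


def threshold_for_false_alarm(negative_scores: Sequence[float], alpha: float) -> float:
    """Return a threshold such that at most a fraction `alpha` of the held-out
    non-target scores lie strictly above it (a flow is labeled `target` when
    its score exceeds the threshold). Hypothetical helper, empirical only."""
    ranked = sorted(negative_scores, reverse=True)
    allowed = int(alpha * len(ranked))  # number of false alarms we tolerate
    if allowed >= len(ranked):
        return float("-inf")
    return ranked[allowed]


# Hypothetical labels for eight flows.
y_true = ["p2p", "p2p", "http", "http", "http", "voip", "voip", "http"]
y_pred = ["p2p", "http", "p2p", "http", "http", "voip", "voip", "http"]

far = false_alarm_rate(y_true, y_pred, "p2p")      # 1 of 6 non-P2P flows flagged
fdr = false_discovery_rate(y_true, y_pred, "p2p")  # 1 of 2 flagged flows is not P2P

# Hypothetical classifier scores for held-out non-target flows.
neg_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.05, 0.6, 0.3, 0.15, 0.25]
thr = threshold_for_false_alarm(neg_scores, alpha=0.2)
empirical_far = sum(s > thr for s in neg_scores) / len(neg_scores)
```

The same predictions give a low false alarm rate but a high false discovery rate for the P2P class, which is why the abstract treats the two guarantees as suited to different operator goals (prioritization versus blocking or rate-limiting).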