Chia-Wei Hu scite author profile

Chia-Wei Hu

4Publications

78Citation Statements Received

21Citation Statements Given

How they've been cited

165

How they cite others

Affiliations

Columbia University

Publications

Order By: Most citations

Behavior-based modeling and its application to Email analysis

Stolfo

Hershkop

et al. 2006

ACM Trans. Internet Technol.

View full text Add to dashboard Cite

The Email Mining Toolkit (EMT) is a data mining system that computes behavior profiles or models of user email accounts. These models may be used for a multitude of tasks including forensic analyses and detection tasks of value to law enforcement and intelligence agencies, as well for as other typical tasks such as virus and spam detection. To demonstrate the power of the methods, we focus on the application of these models to detect the early onset of a viral propagation without "content-based" (or signature-based) analysis in common use in virus scanners. We present several experiments using real email from 15 users with injected simulated viral emails and describe how the combination of different behavior models improves overall detection rates. The performance results vary depending upon parameter settings, approaching 99% true positive (TP) (percentage of viral emails caught) in general cases and with 0.38% false positive (FP) (percentage of emails with attachments that are mislabeled as viral). The models used for this study are based upon volume and velocity statistics of a user's email rate and an analysis of the user's (social) cliques revealed in the person's email behavior. We show by way of simulation that virus propagations are detectable since viruses may emit emails at rates different than human behavior suggests is normal, and email is directed to groups of recipients in ways that violate the users' typical communications with their social groups.

show abstract

Behavior Profiling of Email

Stolfo

Hershkop

Wang

et al. 2003

View full text Add to dashboard Cite

This paper describes the forensic and intelligence analysis capabilities of the Email Mining Toolkit (EMT) under development at the Columbia Intrusion Detection (IDS) Lab. EMT provides the means of loading, parsing and analyzing email logs, including content, in a wide range of formats. Many tools and techniques have been available from the fields of Information Retrieval (IR) and Natural Language Processing (NLP) for analyzing documents of various sorts, including emails. EMT, however, extends these kinds of analyses with an entirely new set of analyses that model "user behavior". EMT thus models the behavior of individual user email accounts, or groups of accounts, including the "social cliques" revealed by a user's email behavior. The application of this technology to diverse Internet objects and events (e.g., email and web transactions) allows for a broad range of behavior-based analyses including the detection of proxy email accounts and groups of user accounts that communicate with one another including covert group activities. Data mining applies machine learning and statistical techniques to automatically discover and detect misuse patterns, as well as anomalous activities in general. When applied to network-based activities and user account observations for the detection of errant or misuse behavior, these methods are referred to as behavior-based misuse detection. Behavior-based misuse detection can provide important new assistance for counter-terrorism intelligence. In addition to standard Internet misuse detection, these techniques will automatically detect certain patterns across user accounts that are indicative of covert, malicious or counter-intelligence activities. Moreover, behavior-based detection provides workbench functionalities to interactively assist an intelligence agent with targeted investigations and off-line forensics analyses. Intelligence officers have a myriad of tasks and problems confronting them each day. The sheer volume of source materials requires a means of honing in on those sources of maximal value to their mission. A variety of techniques can be applied drawing upon the research and technology developed in the field of Information Retrieval. There is, however, an additional source of information available that can used to aid even the simplest task of rank ordering and sorting documents for inspection: behavior models associated with the documents can be used to identify and group sources in interesting new ways. This is demonstrated by the Email Mining Toolkit that applies a variety of data mining techniques for profiling and behavior modeling of email sources. The deployment of behavior-based techniques for intelligence investigation and tracking tasks represents a significant qualitative step in the counter-intelligence "arms race". Because there is no way to predict what data mining will discover over any given data set, "counter-escalation" is particularly difficult. Behavior-based misuse detection is more robust against standard knowledgebased techniques. Behavior-b...

show abstract

A Behavior-Based Approach to Securing Email Systems

Stolfo

Hershkop

Wang

et al. 2003

View full text Add to dashboard Cite

The Malicious Email Tracking (MET) system, reported in a prior publication, is a behavior-based security system for email services. The Email Mining Toolkit (EMT) presented in this paper is an offline email archive data mining analysis system that is designed to assist computing models of malicious email behavior for deployment in an online MET system. EMT includes a variety of behavior models for email attachments, user accounts and groups of accounts. Each model computed is used to detect anomalous and errant email behaviors. We report on the set of features implemented in the current version of EMT, and describe tests of the system and our plans for extensions to the set of models.

show abstract

Combining Behavior Models to Secure Email Systems

Stolfo¹,

Hu²,

Li³

et al. 2003

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Chia-Wei Hu

Behavior-based modeling and its application to Email analysis

Behavior Profiling of Email

A Behavior-Based Approach to Securing Email Systems

Combining Behavior Models to Secure Email Systems

Contact Info

Product

Resources

About