Svetlana Symonenko scite author profile

Liddy

Yılmazel

et al. 2004

Malicious insiders' difficult-to-detect activities pose serious threats to the intelligence community (IC) when these activities go undetected. A novel approach that integrates the results of social network analysis, role-based access monitoring, and semantic analysis of insiders' communications as evidence for evaluation by a risk assessor is being tested on an IC simulation. A semantic analysis, by our proven Natural Language Processing (NLP) system, of the insider's text-based communications produces conceptual representations that are clustered and compared on the expected vs. observed scope. The determined risk level produces an input to a risk analysis algorithm that is merged with outputs from the system's social network analysis and role-based monitoring modules.

Leveraging One-Class SVM and Semantic Analysis to Detect Anomalous Content

Yılmazel

Balasubramanian

et al. 2008

Experiments were conducted to test several hypotheses on methods for improving document classification for the malicious insider threat problem within the Intelligence Community. Bag-of-words (BOW) representations of documents were compared to Natural Language Processing (NLP) based representations in both the typical and one-class classification problems using the Support Vector Machine algorithm. Results show that the NLP features significantly improved classifier performance over the BOW approach both in terms of precision and recall, while using many fewer features. The one-class algorithm using NLP features demonstrated robustness when tested on new domains.

Illuminating trouble tickets with sublanguage theory

Rowe

Liddy

2006

A study was conducted to explore the potential of Natural Language Processing (NLP)based knowledge discovery approaches for the task of representing and exploiting the vital information contained in field service (trouble) tickets for a large utility provider. Analysis of a subset of tickets, guided by sublanguage theory, identified linguistic patterns, which were translated into rule-based algorithms for automatic identification of tickets' discourse structure. The subsequent data mining experiments showed promising results, suggesting that sublanguage is an effective framework for the task of discovering the historical and predictive value of trouble ticket data.

Leveraging One-Class SVM and Semantic Analysis to Detect Anomalous Content

Yılmazel

Balasubramanian

et al. 2005