Richard Chow scite author profile

Detecting inferences in documents is critical for ensuring privacy when sharing information. In this paper, we propose a refined and practical model of inference detection using a reference corpus. Our model is inspired by association rule mining: inferences are based on word co-occurrences. Using the model and taking the Web as the reference corpus, we can find inferences and measure their strength through web-mining algorithms that leverage search engines such as Google or Yahoo!.Our model also includes the important case of private corpora, to model inference detection in enterprise settings in which there is a large private document repository. We find inferences in private corpora by using analogues of our Web-mining algorithms, relying on an index for the corpus rather than a Web search engine.We present results from two experiments. The first experiment demonstrates the performance of our techniques in identifying all the keywords that allow for inference of a particular topic (e.g. "HIV") with confidence above a certain threshold. The second experiment uses the public Enron e-mail dataset. We postulate a sensitive topic and use the Enron corpus and the Web together to find inferences for the topic.These experiments demonstrate that our techniques are practical, and that our model of inference based on word co-occurrence is well-suited to efficient inference detection.

show abstract

Proactive Insider Threat Detection through Graph Learning and Psychological Context

Brdiczka

Liu

Price

et al. 2012

View full text Add to dashboard Cite

Abstract-The annual incidence of insider attacks continues to grow, and there are indications this trend will continue. While there are a number of existing tools that can accurately identify known attacks, these are reactive (as opposed to proactive) in their enforcement, and may be eluded by previously unseen, adversarial behaviors. This paper proposes an approach that combines Structural Anomaly Detection (SA) from social and information networks and Psychological Profiling (PP) of individuals. SA uses technologies including graph analysis, dynamic tracking, and machine learning to detect structural anomalies in large-scale information network data, while PP constructs dynamic psychological profiles from behavioral patterns. Threats are finally identified through a fusion and ranking of outcomes from SA and PP.The proposed approach is illustrated by applying it to a large data set from a massively multi-player online game, World of Warcraft (WoW). The data set contains behavior traces from over 350,000 characters observed over a period of 6 months. SA is used to predict if and when characters quit their guild (a player association with similarities to a club or workgroup in nongaming contexts), possibly causing damage to these social groups. PP serves to estimate the five-factor personality model for all characters. Both threads show good results on the gaming data set and thus validate the proposed approach.

show abstract

Faking contextual data for fun, profit, and privacy

Chow

Golle

2009

View full text Add to dashboard Cite

The amount of contextual data collected, stored, mined, and shared is increasing exponentially. Street cameras, credit card transactions, chat and Twitter logs, e-mail, web site visits, phone logs and recordings, social networking sites, all are examples of data that persists in a manner not under individual control, leading some to declare the death of privacy. We argue here that the ability to generate convincing fake contextual data can be a basic tool in the fight to preserve privacy. One use for the technology is for an individual to make his actual data indistinguishable amongst a pile of false data.In this paper we consider two examples of contextual data, search engine query data and location data. We describe the current state of faking these types of data and our own efforts in this direction.

show abstract

The Rules of Redaction: Identify, Protect, Review (and Repeat)

Bier

Chow

Golle

et al. 2009

IEEE Secur. Privacy Mag.

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Richard Chow

Implicit Authentication through Learning User Behavior

Detecting privacy leaks using corpus-based association rules

Proactive Insider Threat Detection through Graph Learning and Psychological Context

Faking contextual data for fun, profit, and privacy

The Rules of Redaction: Identify, Protect, Review (and Repeat)

Contact Info

Product

Resources

About