Rebecca Cathey scite author profile

Rebecca Cathey

4 Publications

34 Citation Statements Received

113 Citation Statements Given


Affiliations

BAE Systems (United States), Illinois Institute of Technology, Ionic Systems (United States)

Publications


On the relevance of social media platforms in predicting the volume and patterns of web defacement attacks

Maimon

Fukuda

Hinton

et al. 2017


Social media platforms are commonly employed by law enforcement agencies for collecting Open Source Intelligence (OSINT) on criminals and assessing the risk they pose to the environment they live in. However, since no prior research has investigated the relationships between hackers' use of social media platforms and their likelihood to generate cyberattacks, this practice is less common among Information Technology Teams. Addressing this empirical gap, we draw on the social learning theory and estimate the relationships between hackers' use of Facebook, Twitter, and YouTube and the frequency of web defacement attacks they generate at different times (weekdays vs. weekends) and against different targets (USA vs. non-USA websites). To answer our research questions, we use hackers' reports of web defacement they generated (available on http://www.zone-h.org), and complement them with an independent data collection we launched to identify these hackers' use of different social media platforms. Results from a series of Negative Binomial Regression analyses reveal that hackers' use of social media platforms, and specifically Twitter and Facebook, significantly increases the frequency of web defacement attacks they generate. However, while using these social media platforms significantly increases the volume of web defacement attacks these hackers generate during weekdays, it has no association with the volume of web defacement they launch over weekends. Finally, although hackers' use of both Facebook and Twitter accounts increases the frequency of attacks they generate against non-USA websites, only the use of Twitter significantly increases the volume of web defacement attacks against USA websites.


Exploiting parallelism to support scalable hierarchical clustering

Cathey

Jensen

Beitzel

2007

J. Am. Soc. Inf. Sci.


A distributed memory parallel version of the group average hierarchical agglomerative clustering algorithm is proposed to enable scaling the document clustering problem to large collections. Using standard message passing operations reduces interprocess communication while maintaining efficient load balancing. In a series of experiments using a subset of a standard Text REtrieval Conference (TREC) test collection, our parallel hierarchical clustering algorithm is shown to be scalable in terms of processors efficiently used and the collection size. Results show that our algorithm performs close to the expected O(n^2/p) time on p processors rather than the worst-case O(n^3/p) time. Furthermore, the O(n^2/p) memory complexity per node allows larger collections to be clustered as the number of nodes increases. While partitioning algorithms such as k-means are trivially parallelizable, our results confirm those of other studies which showed that hierarchical algorithms produce significantly tighter clusters in the document clustering task. Finally, we show how our parallel hierarchical agglomerative clustering algorithm can be used as the clustering subroutine for a parallel version of the buckshot algorithm to cluster the complete TREC collection at near theoretical runtime expectations.

Introduction

Document clustering has long been considered as a means to potentially improve both retrieval effectiveness and efficiency; however, the intensive computation necessary to cluster the entire collection makes its application to large datasets difficult. Accordingly, there is little work on effectively clustering entire large, standard-text collections and less with the intent of using these clusterings to aid retrieval. Rather, much work has focused on either performing simplified clustering algorithms or only using partial clusterings such as clustering only the results for a given query.

Clustering algorithms generally consist of a trade-off between accuracy and speed. Hierarchical agglomerative clustering algorithms calculate a full document-to-document similarity matrix. Their clusterings are typically viewed as more accurate than other types of clusterings; however, the computational complexity required for the algorithm's quadratic behavior makes it unrealistic for large document collections. Other clustering algorithms such as the k-means and single pass algorithms iteratively partition the data into clusters. Although these partitioning algorithms run in linear time, the assignment of documents to moving centroids produces different clusterings with each run. Some algorithms combine the accuracy of hierarchical agglomerative algorithms with the speed of partitioning algorithms to get an algorithm that is fast with reasonable accuracy. One such algorithm is the buckshot algorithm, which uses a hierarchical agglomerative algorithm as a clustering subroutine.

We propose a hierarchical agglomerative clustering algorithm designed for a distributed memory system in which we use the message passing model to facilitate in...
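As a rough illustration of the hierarchical agglomerative clustering idea the abstract refers to, the following is a minimal single-process Python sketch. It is not the authors' parallel algorithm: it assumes cosine similarity over small dense vectors, reads "group average" as the mean pairwise similarity between two clusters, and uses a naive search for the best pair. The function names `cosine` and `group_average_hac` are hypothetical.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two dense vectors (assumed document representation)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def group_average_hac(docs, k):
    """Merge clusters bottom-up until k remain (illustrative, sequential, naive)."""
    n = len(docs)
    # Full document-to-document similarity matrix: O(n^2) memory.
    sim = [[cosine(docs[i], docs[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]  # start with one cluster per document
    while len(clusters) > max(k, 1):
        best_pair, best_score = None, -math.inf
        for a, b in combinations(range(len(clusters)), 2):
            # Average similarity over all cross-cluster document pairs.
            total = sum(sim[i][j] for i in clusters[a] for j in clusters[b])
            score = total / (len(clusters[a]) * len(clusters[b]))
            if score > best_score:
                best_pair, best_score = (a, b), score
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return [sorted(c) for c in clusters]


# Four toy "documents": two near the x-axis, two near the y-axis.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clusters = group_average_hac(docs, 2)
print(clusters)
```

In a buckshot-style setup, a routine like this would be run only on a small sample of the collection to obtain initial cluster centers, with the remaining documents then assigned to the nearest center; that second step is not shown here.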


Misuse detection for information retrieval systems

Cathey

Li

Goharian

et al. 2003


Optimal content delivery with network coding

Leong

Cathey

2009


We present a unified linear program formulation for optimal content delivery in content delivery networks (CDNs), taking into account various costs and constraints associated with content dissemination from the origin server to storage nodes, data storage, and the eventual fetching of content from storage nodes by end users. Our formulation can be used to achieve a variety of performance goals and system behavior, including the bounding of fetch delay, load balancing, and robustness against node and arc failures. Simulation results suggest that our formulation performs significantly better than the traditional minimum k-median formulation for the delivery of multiple content, even under modest circumstances (small network, few objects, low storage budget, low dissemination costs).
