Social media platforms are commonly employed by law enforcement agencies for collecting Open Source Intelligence (OSINT) on criminals and assessing the risk they pose to the environment they live in. However, since no prior research has investigated the relationships between hackers' use of social media platforms and their likelihood to generate cyberattacks, this practice is less common among Information Technology teams. Addressing this empirical gap, we draw on social learning theory and estimate the relationships between hackers' use of Facebook, Twitter, and YouTube and the frequency of web defacement attacks they generate at different times (weekdays vs. weekends) and against different targets (USA vs. non-USA websites). To answer our research questions, we use hackers' reports of web defacements they generated (available on http://www.zone-h.org) and complement them with an independent data collection we launched to identify these hackers' use of different social media platforms. Results from a series of Negative Binomial Regression analyses reveal that hackers' use of social media platforms, and specifically Twitter and Facebook, significantly increases the frequency of web defacement attacks they generate. However, while using these social media platforms significantly increases the volume of web defacement attacks these hackers generate during weekdays, it has no association with the volume of web defacements they launch over weekends. Finally, although hackers' use of both Facebook and Twitter accounts increases the frequency of attacks they generate against non-USA websites, only the use of Twitter significantly increases the volume of web defacement attacks against USA websites.
The PropBank primarily adds semantic role labels to the syntactic constituents in the parsed trees of the Treebank. The goal is for automatic semantic role labeling to be able to use the domain of locality of a predicate in order to find its arguments. In principle, this is exactly what is wanted, but in practice the PropBank annotators often make choices that do not actually conform to the Treebank parses. As a result, the syntactic features extracted by automatic semantic role labeling systems are often inconsistent and contradictory. This paper discusses in detail the types of mismatches between the syntactic bracketing and the semantic role labeling that can be found, and our plans for reconciling them.
The Proposition Bank (PropBank) project is aimed at creating a corpus of text annotated with information about semantic propositions. The second phase of the project, PropBank II, adds further levels of semantic annotation, which include eventuality variables, co-reference, coarse-grained sense tags, and discourse connectives. This paper presents the results of the parallel PropBank II project, which adds these richer layers of semantic annotation to the first 100K of the Chinese Treebank and its English translation. Our preliminary analysis supports the hypothesis that this additional annotation reconciles many of the surface differences between the two languages.
The Termolator is an open-source, high-performing terminology extraction system, available on GitHub. The Termolator combines several different approaches to get superior coverage and precision. The in-line term component identifies potential instances of terminology using a chunking procedure, similar to noun group chunking, but favoring chunks that contain out-of-vocabulary words, nominalizations, technical adjectives, and other specialized word classes. The distributional component ranks such term chunks according to several metrics, including: (a) a set of metrics that favors term chunks that are relatively more frequent in a "foreground" corpus about a single topic than they are in a "background" or multi-topic corpus; (b) a well-formedness score based on linguistic features; and (c) a relevance score which measures how often terms appear in articles and patents in a Yahoo web search. We analyse the contributions made by each of these components and show that all modules contribute to the system's performance, both in terms of the number and quality of terms identified. This paper expands upon previous publications about this research and includes descriptions of some of the improvements made since its initial release. This study also includes a comparison with another terminology extraction system available on-line, Termostat (Drouin, 2003). We found that the systems get comparable results when applied to small amounts of data: about 50% precision for a single foreground file (Einstein's Theory of Relativity). However, when running the system with 500 patent files as foreground, Termolator performed significantly better than Termostat. For 500 refrigeration patents, Termolator got 70% precision vs. Termostat's 52%. For 500 semiconductor patents, Termolator got 79% precision vs. Termostat's 51%.