Privacy is an important concern for our society where sharing data with partners or releasing data to the public is a frequent occurrence. Some of the techniques that are being used to achieve privacy are to remove identifiers, alter quasi-identifiers, and perturb values. Unfortunately, these approaches suffer from two limitations. First, it has been shown that private information can still be leaked if attackers possess some background knowledge or other information sources. Second, they do not take into account the adverse impact these methods will have on the utility of the released data. In this paper, we propose a method that meets both requirements. Our method, called table-GAN, uses generative adversarial networks (GANs) to synthesize fake tables that are statistically similar to the original table yet do not incur information leakage. We show that the machine learning models trained using our synthetic tables exhibit performance that is similar to that of models trained using the original table for unknown testing cases. We call this property model compatibility. We believe that anonymization/perturbation/synthesis methods without model compatibility are of little value. We used four real-world datasets from four different domains for our experiments and conducted indepth comparisons with state-of-the-art anonymization, perturbation, and generation techniques. Throughout our experiments, only our method consistently shows balance between privacy level and model compatibility.
Abstract. Online content publishers often use catchy headlines for their articles in order to attract users to their websites. These headlines, popularly known as clickbaits, exploit a user's curiosity gap and lure them to click on links that often disappoint them. Existing methods for automatically detecting clickbaits rely on heavy feature engineering and domain knowledge. Here, we introduce a neural network architecture based on Recurrent Neural Networks for detecting clickbaits. Our model relies on distributed word representations learned from a large unannotated corpora, and character embeddings learned via Convolutional Neural Networks. Experimental results on a dataset of news headlines show that our model outperforms existing techniques for clickbait detection with an accuracy of 0.98 with F1-score of 0.98 and ROC-AUC of 0.99.
No abstract
Traffic forecasting is one of the most popular spatio-temporal tasks in the field of machine learning. A prevalent approach in the field is to combine graph convolutional networks and recurrent neural networks for the spatio-temporal processing. There has been fierce competition and many novel methods have been proposed. In this paper, we present the method of spatio-temporal graph neural controlled differential equation (STG-NCDE). Neural controlled differential equations (NCDEs) are a breakthrough concept for processing sequential data. We extend the concept and design two NCDEs: one for the temporal processing and the other for the spatial processing. After that, we combine them into a single framework. We conduct experiments with 6 benchmark datasets and 20 baselines. STG-NCDE shows the best accuracy in all cases, outperforming all those 20 baselines by non-trivial margins.
Knowing if/when a cyber-vulnerability will be exploited and how severe the vulnerability is can help enterprise security officers (ESOs) come up with appropriate patching schedules. Today, this ability is severely compromised: our study of data from Mitre and NIST shows that on average there is a 132 day gap between the announcement of a vulnerability by Mitre and the time NIST provides an analysis with severity score estimates and 8 important severity attributes. Many attacks happen during this very 132-day window. We present Vulnerability Exploit Scoring \& Timing (VEST), a system for (early) prediction and visualization of if/when a vulnerability will be exploited, and its estimated severity attributes and score.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.