Due to the recent trend towards attribute-based access control (ABAC), several studies have proposed constraints specification languages for ABAC. These formal languages enable security architects to express constraints in a precise mathematical notation. However, since manually formulating constraints involves analyzing multiple natural language policy documents in order to infer constraints-relevant information, constraints specification becomes a repetitive, time-consuming and error-prone task. To bridge the gap between the natural language expression of constraints and formal representations, we propose an automated framework to infer elements forming ABAC constraints from natural language policies. Our proposed approach is built upon recent advancements in natural language processing, specifically, sequence labeling. The experiments, using Bidirectional Long-Short Term Memory (BiLSTM), achieved an F1 score of 0.91 in detecting at least 75% of each constraint expression. The results suggest that the proposed approach holds promise for enabling this automation.
The National Institute of Standards and Technology (NIST) has identified natural language policies as the preferred expression of policy and implicitly called for an automated translation of ABAC natural language access control policy (NLACP) to a machine-readable form. To study the automation process, we consider the hierarchical ABAC model as our reference model since it better reflects the requirements of real-world organizations. Therefore, this paper focuses on the questions of: how can we automatically infer the hierarchical structure of an ABAC model given NLACPs; and, how can we extract and define the set of authorization attributes based on the resulting structure. To address these questions, we propose an approach built upon recent advancements in natural language processing and machine learning techniques. For such a solution, the lack of appropriate data often poses a bottleneck. Therefore, we decouple the primary contributions of this work into: (1) developing a practical framework to extract authorization attributes of hierarchical ABAC system from natural language artifacts, and (2) generating a set of realistic synthetic natural language access control policies (NLACPs) to evaluate the proposed framework. Our experimental results are promising as we achieved -in average -an F1-score of 0.96 when extracting attributes values of subjects, and 0.91 when extracting the values of objects' attributes from natural language access control policies.
Over the last decade, the Short Message Service (SMS) has become a primary communication channel. Nevertheless, its popularity has also given rise to the so-called SMS spam. These messages, i.e., spam, are annoying and potentially malicious by exposing SMS users to credential theft and data loss. To mitigate this persistent threat, we propose a new model for SMS spam detection based on pre-trained Transformers and Ensemble Learning. The proposed model uses a text embedding technique that builds on the recent advancements of the GPT-3 Transformer. This technique provides a high-quality representation that can improve detection results. In addition, we used an Ensemble Learning method where four machine learning models were grouped into one model that performed significantly better than its separate constituent parts. The experimental evaluation of the model was performed using the SMS Spam Collection Dataset. The obtained results showed a state-of-the-art performance that exceeded all previous works with an accuracy that reached 99.91%.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.