Ivan Fursov scite author profile

The adversarial attack paradigm explores scenarios in which machine learning models, especially deep learning models, are vulnerable: minor changes to the model input can force a classifier to fail on a particular example. Most state-of-the-art frameworks focus on adversarial attacks for images and other structured model inputs. Adversarial attacks on categorical sequences can also be harmful if they succeed. However, successful attacks on inputs based on categorical sequences must address the following challenges: (1) non-differentiability of the target function, (2) constraints on transformations of initial sequences, and (3) diversity of possible problems. We handle these challenges using two approaches. The first adopts Monte-Carlo methods and can be used in any scenario; the second uses a continuous relaxation of models and target metrics, and thus allows general state-of-the-art adversarial attack methods to be applied with little additional effort. Results for money transactions, medical fraud, and NLP datasets suggest that the proposed methods generate reasonable adversarial sequences that are close to the original ones but fool machine learning models, even in black-box adversarial attacks.
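The Monte-Carlo idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain random-sampling attack, and the names `monte_carlo_attack`, `toy_predict`, the token vocabulary, and the toy decision rule are all hypothetical. It shows how a black-box attack can work without gradients (challenge 1) while limiting the number of changed tokens (challenge 2).

```python
import random
from typing import Callable, List, Optional, Sequence


def hamming(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def monte_carlo_attack(
    seq: Sequence[str],
    predict: Callable[[Sequence[str]], int],
    vocab: Sequence[str],
    max_changes: int = 2,
    n_samples: int = 500,
    seed: int = 0,
) -> Optional[List[str]]:
    """Randomly sample substitutions; return the closest sample that flips the label.

    Only calls predict(), so no gradients of the model are required.
    Assumes seq is non-empty. Returns None if no sample flips the label.
    """
    rng = random.Random(seed)
    original_label = predict(seq)
    best: Optional[List[str]] = None
    for _ in range(n_samples):
        candidate = list(seq)
        # Constraint on transformations: change at most max_changes positions.
        k = rng.randint(1, min(max_changes, len(seq)))
        for pos in rng.sample(range(len(seq)), k):
            candidate[pos] = rng.choice(vocab)
        if predict(candidate) != original_label:
            if best is None or hamming(seq, candidate) < hamming(seq, best):
                best = candidate
    return best


def toy_predict(seq: Sequence[str]) -> int:
    """Hypothetical stand-in classifier: flags sequences with two or more cash-outs."""
    return int(list(seq).count("cash_out") >= 2)


vocab = ["pay", "transfer", "cash_out"]
seq = ["pay", "cash_out", "cash_out", "pay"]
adversarial = monte_carlo_attack(seq, toy_predict, vocab, max_changes=1)
```

In this toy setting a single substitution is enough to change the prediction. A real attack would replace `toy_predict` with queries to the target model and would typically use a smarter proposal distribution than uniform sampling.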


Sequence embeddings help to identify fraudulent cases in healthcare insurance

Fursov¹, Zaytsev², Khasyanov³ et al. 2019

Preprint


Fraud causes substantial costs and losses for companies and clients in the finance and insurance industries. Examples are fraudulent credit card transactions or fraudulent claims. It has been estimated that roughly 10 percent of the insurance industry's incurred losses and loss adjustment expenses each year stem from fraudulent claims. The rise and proliferation of digitization in finance and insurance has led to big data sets, consisting in particular of text data, which can be used for fraud detection. In this paper we propose architectures for text embeddings via deep learning, which help to improve the detection of fraudulent claims compared to other machine learning methods. We illustrate our methods using a data set from a large international health insurance company. The empirical results show that our approach outperforms other state-of-the-art methods and can help make the claims management process more efficient. As (unstructured) text data become increasingly available to economists and econometricians, our proposed methods will be valuable for many similar applications, particularly when variables have a large number of categories, as is typical, for example, of the International Classification of Diseases (ICD) codes in health economics and health services.


A Differentiable Language Model Adversarial Attack on Text Classifiers

Fursov¹, Zaytsev², Burnyshev³ et al. 2021

Preprint


Robustness of huge Transformer-based models for natural language processing is an important issue due to their capabilities and wide adoption. One way to understand and improve the robustness of these models is to explore an adversarial attack scenario: check whether a small perturbation of an input can fool a model. Due to the discrete nature of textual data, gradient-based adversarial methods, widely used in computer vision, are not applicable per se. The standard strategy to overcome this issue is to develop token-level transformations, which do not take the whole sentence into account. In this paper, we propose a new black-box sentence-level attack. Our method fine-tunes a pre-trained language model to generate adversarial examples. A proposed differentiable loss function depends on a substitute classifier score and an approximate edit distance computed via a deep learning model. We show that the proposed attack outperforms competitors on a diverse set of NLP problems for both computed metrics and human evaluation. Moreover, due to the use of the fine-tuned language model, the generated adversarial examples are hard to detect, so current models are not robust. Hence, it is difficult to defend against the proposed attack, which is not the case for other attacks.
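To make the two ingredients of the loss concrete, here is a minimal sketch under loud assumptions. The abstract does not give the loss formula, so the additive form below, the weight `beta`, and the function names are hypothetical. The paper uses a learned, differentiable approximation of edit distance; this sketch substitutes the exact token-level Levenshtein distance, which is not differentiable and only illustrates the quantity being approximated.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Exact edit distance (insertions, deletions, substitutions) between token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute x with y (free if equal)
            ))
        prev = cur
    return prev[-1]


def attack_objective(
    original: Sequence[str],
    candidate: Sequence[str],
    prob_true_class: float,
    beta: float = 1.0,
) -> float:
    """Hypothetical objective: lower is better for the attacker.

    prob_true_class is the substitute classifier's probability for the true
    label on the candidate; the second term penalizes drifting far from the
    original sentence.
    """
    return prob_true_class + beta * levenshtein(original, candidate)


original = ["a", "good", "movie"]
candidate = ["a", "bad", "movie"]
score = attack_objective(original, candidate, prob_true_class=0.2, beta=0.1)
```

A candidate that lowers the classifier's confidence with a single token edit scores better than one that needs many edits, which is the trade-off the abstract describes between fooling the model and staying close to the input.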



Ivan Fursov

Adversarial Attacks on Deep Models for Financial Transaction Records

Gradient-based adversarial attacks on categorical sequence models via traversing an embedded world
