Since machine learning models have become a valuable asset for companies, watermarking techniques have been developed to protect the intellectual property of these models and prevent model theft. We observe that current watermarking frameworks target image classification tasks exclusively, neglecting a considerable part of machine learning techniques. In this paper, we address this gap and study the watermarking of various machine learning techniques, namely machine translation, regression, binary image classification, and reinforcement learning models. We adapt current definitions to each specific technique and evaluate the main characteristics of the watermarking process, in particular the robustness of the models against a rational adversary. We show that watermarking models beyond classification is possible while preserving their overall performance. We further investigate various attacks and discuss the importance of the performance metric used in the verification process and its impact on the success of the adversary.
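The abstract's closing point, that the choice of performance metric drives the verification outcome, is easiest to see outside classification. Below is a minimal, illustrative numpy sketch (not the paper's protocol) of trigger verification for a regression model, where agreement is defined by a tolerance on absolute error rather than an exact label match; the function name, trigger shapes, and the 0.05 tolerance are all assumptions.

```python
import numpy as np

def regression_watermark_score(predict, X_trigger, y_trigger, tol=0.05):
    """Fraction of secret trigger inputs whose prediction falls within
    `tol` of the embedded target. With no exact labels to match, the
    metric itself (here absolute error under `tol`) decides what the
    verifier accepts, and thus what an adversary must evade."""
    preds = predict(X_trigger)
    return float(np.mean(np.abs(preds - y_trigger) <= tol))

# Toy usage: a watermarked model would score near 1.0 on its own triggers;
# an unrelated model should score close to 0 under a tight tolerance.
rng = np.random.default_rng(0)
X_trigger = rng.normal(size=(32, 8))
y_trigger = rng.normal(size=32)
w_rand = rng.normal(size=8)  # stands in for an independent, non-watermarked model
print(regression_watermark_score(lambda X: X @ w_rand, X_trigger, y_trigger))
```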
With the development of machine learning models for task automation, watermarking appears to be a suitable solution to protect one's intellectual property. Indeed, by embedding secret markers, called trigger instances, into the model, the model owner can analyze the behavior of any suspect model on these markers and claim ownership if the behavior matches. However, in the context of a Machine Learning as a Service (MLaaS) platform where models are available for inference, an attacker could forge such proofs in order to claim ownership of a watermarked model and profit from it. This type of attack, called a watermark forging attack, is a serious threat to the intellectual property of model owners. Current work provides limited solutions to this problem: they constrain model owners to disclose either their models or their trigger set to a third party. In this paper, we propose countermeasures against watermark forging attacks in a black-box environment, compatible with privacy-preserving machine learning where both the model weights and the inputs can be kept private. We show that our solution successfully prevents two different types of watermark forging attacks under minimal assumptions regarding access to either the model's weights or the content of the trigger set.
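To make the threat model concrete: black-box ownership verification of the kind described above boils down to an agreement test against a remote inference API. The sketch below is a hedged illustration of that test, not the paper's countermeasure; `query_api`, the trigger pairs, and the 0.9 threshold are assumed names and values.

```python
from typing import Callable, Sequence

def verify_ownership(query_api: Callable[[object], int],
                     triggers: Sequence[object],
                     expected: Sequence[int],
                     threshold: float = 0.9) -> bool:
    """Black-box check: a suspect model that agrees with the secret
    trigger labels well above chance is flagged as derived from the
    watermarked model."""
    matches = sum(query_api(x) == y for x, y in zip(triggers, expected))
    return matches / len(triggers) >= threshold
```

Nothing in this test by itself binds the trigger set to the model's training history, which is the gap a forging adversary exploits by fabricating trigger pairs after the fact; the countermeasures proposed in the paper aim to close that gap without forcing the owner to reveal the model weights or the trigger set.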
With the increasing deployment of machine learning models in daily business, a strong need for intellectual property protection has arisen. For this purpose, current works suggest leveraging backdoor techniques to embed a watermark into the model by overfitting it to a set of specially crafted, secret input-output pairs called triggers. By sending verification queries containing triggers, the model owner can analyze the behavior of any suspect model and claim ownership. However, in scenarios where frequent monitoring is needed, the sheer volume of verification queries makes backdoor-based watermarking too sensitive to outlier detection attacks and unable to guarantee the secrecy of the triggers. To solve this issue, we introduce BlindSpot, which watermarks machine learning models through fairness. Our trigger-less approach supports a high number of verification queries while being robust to outlier detection attacks. We show on the Fashion-MNIST and CIFAR-10 datasets that BlindSpot watermarks models effectively while remaining robust to outlier detection attacks, at a cost of 2% in accuracy.

CCS Concepts: • Security and privacy; • Theory of computation → Design and analysis of algorithms.
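For reference, the backdoor-based baseline the abstract criticizes can be sketched in a few lines: train on clean data mixed with secret, arbitrarily labeled triggers, then verify via trigger agreement. The numpy toy below (a logistic-regression stand-in with made-up sizes) illustrates that baseline only; BlindSpot itself is trigger-less and works through fairness instead.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy binary task standing in for an image dataset.
X_clean = rng.normal(size=(2000, 64))
y_clean = (X_clean[:, 0] + X_clean[:, 1] > 0).astype(float)

# Secret trigger set: off-distribution inputs with arbitrary labels
# that the model is made to memorize (the watermark).
X_trig = 3.0 * rng.normal(size=(30, 64))
y_trig = rng.integers(0, 2, size=30).astype(float)

X = np.vstack([X_clean, X_trig])
y = np.concatenate([y_clean, y_trig])

# Logistic regression trained by gradient descent on clean + trigger data.
w, b = np.zeros(64), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

# Verification: the owner measures agreement on the secret triggers.
pred_trig = (1.0 / (1.0 + np.exp(-(X_trig @ w + b)))) > 0.5
print("trigger agreement:", np.mean(pred_trig == (y_trig > 0.5)))
```

The very properties that make such triggers memorable, their off-distribution placement and the volume of queries needed at verification time, are what expose them to outlier detection; this is the weakness BlindSpot's trigger-less design targets.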