Abstract: Despite the vulnerability of object detectors to adversarial attacks, very few defenses are known to date. While adversarial training can improve the empirical robustness of image classifiers, a direct extension to object detection is very expensive. This work is motivated by recent progress on certified classification by randomized smoothing. We start by presenting a reduction from object detection to a regression problem. Then, to enable certified regression, where standard mean smoothing fails, we propose median smoothing…
“…For our certificates, we focus on the ℓ2 adversary described above: the goal of certification is to bound the worst-case decrease in trigger set accuracy, given that the model parameters do not move too far in ℓ2 distance. Doing this directly is in general quite difficult (Katz et al., 2019), but using techniques from (Chiang et al., 2020; Cohen et al., 2019), we show that by adding random noise to the parameters it is possible to define a smoothed version of the model and bound the change in its trigger set accuracy.…”
Section: Watermark Certification · mentioning (confidence: 99%)
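To make the construction in the snippet above concrete, here is a minimal sketch (not the authors' code) of parameter-space smoothing: Gaussian noise is drawn around the weights, and trigger-set accuracy is averaged over noise draws by Monte Carlo sampling. PyTorch is assumed, and `model`, `trigger_loader`, and `sigma` are illustrative placeholders.

```python
# Minimal sketch of parameter-space randomized smoothing: estimate the
# trigger-set accuracy of the smoothed model by averaging over Gaussian
# noise added to the parameters (Monte Carlo). `model` and `trigger_loader`
# are hypothetical; this is not the paper's implementation.
import copy
import torch

def smoothed_trigger_accuracy(model, trigger_loader, sigma=0.1, n_samples=100):
    accs = []
    for _ in range(n_samples):
        noisy = copy.deepcopy(model)  # perturb a copy, keep the original intact
        with torch.no_grad():
            for p in noisy.parameters():
                p.add_(torch.randn_like(p) * sigma)  # noise on parameters, not inputs
        noisy.eval()
        correct, total = 0, 0
        with torch.no_grad():
            for x, y in trigger_loader:
                correct += (noisy(x).argmax(dim=1) == y).sum().item()
                total += y.numel()
        accs.append(correct / total)
    return sum(accs) / len(accs)  # Monte Carlo estimate of smoothed trigger accuracy
```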
“…Certified adversarial robustness involves not only training the model to be robust to adversarial attacks under particular threat models, but also proving that no possible attack under a particular constraint could succeed. Specifically, in this paper, we used the randomized smoothing technique first developed by (Cohen et al., 2019; Lecuyer et al., 2019) for classifiers, and later extended by (Chiang et al., 2020) to deal with regression models. However, as opposed to defending against an ℓ2-bounded threat model in the image space, we are now defending against an ℓ2-bounded adversary in the parameter space.…”
Section: Related Work · mentioning (confidence: 99%)
“…Deriving the certificate Before we start describing the watermark certificate, we will first introduce the percentile smoothed function from Chiang et al. (2020). Definition 3.1.…”
Section: Watermark Certification · mentioning (confidence: 99%)
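For reference, the percentile-smoothed function of Definition 3.1 can be restated as follows (a paraphrase of Chiang et al., 2020, with base function f and Gaussian noise G ∼ N(0, σ²I); the two forms are the lower and upper p-th percentiles):

```latex
% Percentile smoothing (paraphrasing Chiang et al., 2020):
% for f : \mathbb{R}^d \to \mathbb{R} and G \sim \mathcal{N}(0, \sigma^2 I),
\underline{h}_p(x) = \sup\{\, y \in \mathbb{R} \;:\; \mathbb{P}[f(x+G) \le y] < p \,\},
\qquad
\overline{h}_p(x) = \inf\{\, y \in \mathbb{R} \;:\; \mathbb{P}[f(x+G) \le y] \ge p \,\}.
```

Taking p = 0.5 recovers the median smoothing used for certified regression.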
“…As mentioned in (Chiang et al., 2020), the two forms h̲_p and h̄_p (the lower and upper percentile-smoothed functions) are needed to handle edge cases with discrete distributions. While h_p may not admit a closed form, we can approximate it by Monte Carlo sampling (Cohen et al., 2019).…”
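A minimal sketch of that Monte Carlo approximation is given below: the percentile-smoothed value is estimated by the empirical p-th quantile of f under Gaussian noise. The names `f`, `sigma`, and the sample count are illustrative placeholders, not names from the paper's code.

```python
# Minimal sketch: approximate the percentile-smoothed value h_p(x) by the
# empirical p-th quantile of f evaluated under Gaussian noise.
import numpy as np

def percentile_smooth(f, x, p=0.5, sigma=0.25, n_samples=1000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    noise = rng.normal(scale=sigma, size=(n_samples,) + x.shape)
    samples = np.array([f(x + eps) for eps in noise])  # n_samples noisy evaluations
    return np.quantile(samples, p)  # p = 0.5 recovers median smoothing
```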
Watermarking is a commonly used strategy to protect creators' rights to digital images, videos and audio. Recently, watermarking methods have been extended to deep learning models: in principle, the watermark should be preserved when an adversary tries to copy the model. However, in practice, watermarks can often be removed by an intelligent adversary. Several papers have proposed watermarking methods that claim to be empirically resistant to different types of removal attacks, but these new techniques often fail in the face of new or better-tuned adversaries. In this paper, we propose a certifiable watermarking method. Using the randomized smoothing technique proposed in Chiang et al. (2020), we show that our watermark is guaranteed to be unremovable unless the model parameters are changed by more than a certain ℓ2 threshold. In addition to being certifiable, our watermark is also empirically more robust compared to previous watermarking methods. Our experiments can be reproduced with code at https://github.com/arpitbansal297/Certified_Watermarks
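For completeness, the ℓ2 certificate from Chiang et al. (2020) that underlies this guarantee can be sketched as follows (a paraphrase in the notation of Definition 3.1; Φ is the standard Gaussian CDF, and in the watermark setting the perturbation δ acts on the parameters rather than the input):

```latex
% Percentile-smoothing certificate (paraphrasing Chiang et al., 2020):
% if \|\delta\|_2 \le \epsilon and G \sim \mathcal{N}(0, \sigma^2 I), then
\underline{h}_{\underline{p}}(x) \;\le\; h_p(x + \delta) \;\le\; \overline{h}_{\overline{p}}(x),
\quad \text{where} \quad
\underline{p} = \Phi\!\left(\Phi^{-1}(p) - \tfrac{\epsilon}{\sigma}\right),
\qquad
\overline{p} = \Phi\!\left(\Phi^{-1}(p) + \tfrac{\epsilon}{\sigma}\right).
```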
“…In the domain of object detection, most existing defenses focus on global perturbations with an ℓp-norm constraint [8,10,51], and only a few defenses [20,39,48] for patch attacks have been proposed. Saha et al. [39] proposed Grad-defense and OOC defense against blindness attacks, in which the detector is made blind to a specific object category chosen by the adversary.…”
Object detection plays a key role in many security-critical systems. Adversarial patch attacks, which are easy to implement in the physical world, pose a serious threat to state-of-the-art object detectors. Developing reliable defenses for object detectors against patch attacks is critical but severely understudied. In this paper, we propose Segment and Complete defense (SAC), a general framework for defending object detectors against patch attacks through detecting and removing adversarial patches. We first train a patch segmenter that outputs patch masks that provide pixel-level localization of adversarial patches. We then propose a self-adversarial training algorithm to robustify the patch segmenter. In addition, we design a robust shape completion algorithm, which is guaranteed to remove the entire patch from the images given that the outputs of the patch segmenter are within a certain Hamming distance of the ground-truth patch masks. Our experiments on the COCO and xView datasets demonstrate that SAC achieves superior robustness even under strong adaptive attacks with no performance drop on clean images, and generalizes well to unseen patch shapes, attack budgets, and unseen attack methods. Furthermore, we present the APRICOT-Mask dataset, which augments the APRICOT dataset with pixel-level annotations of adversarial patches. We show SAC can significantly reduce the targeted attack success rate of physical patch attacks.
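As an illustration of the detect-and-remove pipeline this abstract describes, here is a minimal sketch, not the authors' implementation: `segmenter` and `detector` are hypothetical callables, and a fixed dilation margin merely stands in for the paper's shape-completion guarantee.

```python
# Minimal sketch of a segment-then-remove defense in the spirit of SAC.
# `segmenter` and `detector` are hypothetical; dilating the predicted mask
# approximates the role of shape completion (covering segmentation errors).
import numpy as np
from scipy.ndimage import binary_dilation

def defend(image, segmenter, detector, margin=4):
    mask = segmenter(image) > 0.5                    # HxW boolean patch mask
    mask = binary_dilation(mask, iterations=margin)  # grow mask to absorb errors
    cleaned = image.copy()
    cleaned[mask] = 0.0                              # remove the suspected patch region
    return detector(cleaned)                         # detect on the cleaned image
```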