OOD-CV: A Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images

Zhao, Bingchen; Yu, Shaozuo; Ma, Wufei; Yu, Mingxin; Wang, Angtian; He, Jinrong; Yuille, Alan; Kortylewski, Adam

doi:10.1007/978-3-031-20074-8_10

Cited by 15 publications

(3 citation statements)

References 42 publications

“…They show feasibility of this attack on MiniGPT-4 [17], Fuyu [40], LLaVA [16]. This attack is evaluated on Safety Evaluation Benchmark, focusing on two evaluation scenarios: (a) OODCV-VQA and Counterfactual Variant: A novel VQA dataset is proposed grounded on images from OODCV [41]. The dataset includes questions with pre-defined templates for yes/no or digit responses.…”

Section: Red Teaming Methods (mentioning)

confidence: 99%

Red Teaming for Multimodal Large Language Models: A Survey

Mahato, Kumar, Singh et al. 2024

Preprint

“…Regarding natural inputs, several methods have been proposed for defining perturbations and generating perturbed input images, guided by knowledge of the application domain [58], coverage metrics [59,60], or properties that the system must satisfy, such as metamorphic relations [22,61]. In addition, several perturbation benchmarks have been proposed in the literature to assess robustness [7,62,63,64]. However, few of these methods have been developed and evaluated in the context of 2D object detection systems.…”

Section: Robustness Testing of AI-Based Object Detection Systems (mentioning)

confidence: 99%

“…It aims to verify that inserting objects into the background of an image does not change the results for the other objects. For their part, Zhao et al. [63] proposed a natural perturbation benchmark for testing models in the field of computer vision, including object detection models.…”

Section: Robustness Testing of AI-Based Object Detection Systems (mentioning)

confidence: 99%

Robustness Assessment of AI-Based 2D Object Detection Systems: A Method and Lessons Learned from Two Industrial Cases

Wozniak, Segura, Mazo 2024

Electronics

The reliability of AI-based object detection models has gained interest with their increasing use in safety-critical systems and the development of new regulations on artificial intelligence. To meet the need for robustness evaluation, several authors have proposed methods for testing these models. However, applying these methods in industrial settings can be difficult, and several challenges have been identified in practice in the design and execution of tests. There is, therefore, a need for clear guidelines for practitioners. In this paper, we propose a method and guidelines for assessing the robustness of AI-based 2D object detection systems, based on the Goal Question Metric approach. The method defines the overall robustness testing process and a set of recommended metrics to be used at each stage of the process. We developed and evaluated the method through action research cycles, based on two industrial cases and feedback from practitioners. Thus, the resulting method addresses issues encountered in practice. A qualitative evaluation of the method by practitioners was also conducted to provide insights that can guide future research on the subject.
