“…As discussed by Russakovsky et al. [22], the 1,000 detailed classes of ILSVRC (e.g., coucal, Sealyham terrier) require expert training for humans to classify adequately: they must not only visually distinguish the objects, but also retrieve the correct label from the 1,000 options. Thus, along similar lines to the experiments conducted by Geirhos et al. [7] and Zhang et al. [31], we frame the task over the following twenty high-level classes: bear, bird, cat, dog, fish, flower, fox, fruit, fungus, hippopotamus, insect, lion, monkey, reptile, shark, spider, tiger, vegetable, vehicle, wolf. We select these classes because they should be generally recognisable to humans without prior training. Furthermore, we select mostly plants and animals to provide a more challenging classification task: visually similar pairs such as lion/tiger, fruit/vegetable, dog/wolf, and insect/spider give non-trivial cases to distinguish visually.…”
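
As a rough illustration of the framing described above, the following minimal sketch shows one way fine-grained labels could be collapsed into the twenty high-level classes. It is not the authors' implementation: the lookup table `FINE_TO_COARSE`, the helper names `to_high_level` and `aggregate_scores`, and the choice to sum scores per high-level class are all assumptions made for illustration. Only the two example labels named in the text are mapped, and a full mapping would have to be built separately.

```python
from collections import defaultdict

# The twenty high-level classes listed in the text.
HIGH_LEVEL_CLASSES = [
    "bear", "bird", "cat", "dog", "fish", "flower", "fox", "fruit", "fungus",
    "hippopotamus", "insect", "lion", "monkey", "reptile", "shark", "spider",
    "tiger", "vegetable", "vehicle", "wolf",
]

# Hypothetical, deliberately partial lookup table: only the two example
# labels mentioned in the text are included here.
FINE_TO_COARSE = {
    "coucal": "bird",
    "sealyham terrier": "dog",
}


def to_high_level(fine_label):
    """Return the high-level class for a fine-grained label, or None if unmapped."""
    return FINE_TO_COARSE.get(fine_label.lower())


def aggregate_scores(fine_scores):
    """Sum per-label scores into per-high-level-class scores.

    Summation is an assumption of this sketch; labels without a mapping
    are skipped.
    """
    coarse = defaultdict(float)
    for label, score in fine_scores.items():
        target = to_high_level(label)
        if target is not None:
            coarse[target] += score
    return dict(coarse)
```

For example, `to_high_level("Sealyham terrier")` yields `"dog"`, while a label outside the table yields `None`, so out-of-scope classes can be filtered out before evaluation.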