Camouflage is widespread in the biological domain, and armed forces around the world have used it extensively to make visual detection and classification of objects of military interest more difficult. The recent advent of increasingly autonomous military agents raises the question of whether camouflage can affect autonomous agents as it affects human agents and, if so, what kind of camouflage will be effective against such adversaries. In previous work, we have shown that image classifiers based on deep neural networks can be confused by patterns generated by generative adversarial networks (GANs). Specifically, we trained a classifier to distinguish between two ship types, military and civilian. We then used a GAN to generate patterns that, when overlaid on parts of military vessels (frigates), made the classifier confuse the modified frigates with civilian vessels. We termed such patterns "adversarial camouflage" (AC), since they effectively camouflage the frigates with respect to the classifier. The attack described in our previous work is a so-called white box attack: an adversarial attack devised with full knowledge of the classifier under attack. Black box attacks, by contrast, target unknown classifiers. In our context, the ultimate goal is to design a GAN capable of black box attacks, in other words, a GAN that generates AC that is effective across a wide range of neural network classifiers. In the current work, we study techniques to improve the robustness of our GAN-based approach by investigating whether a GAN can be trained to fool a selection of several neural network-based classifiers, or to reduce the confidence of their classifications to a degree that makes them unreliable.
Our results indicate that it is indeed possible to weaken a wider range of neural network classifiers by training the generator on several classifiers.
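As a rough illustration of training a generator against several classifiers at once, the sketch below averages a cross-entropy "fooling" loss over an ensemble. It is a minimal sketch under stated assumptions, not the method used in this work: the function names, the class index `CIVILIAN`, and the example logits are all hypothetical, and the averaged cross-entropy is only one plausible way to combine the classifiers.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_fooling_loss(logits_per_classifier, target_class):
    """Mean cross-entropy toward target_class over all classifiers.

    A low value means that, on average, the classifiers assign a high
    probability to the class the generator wants them to predict.
    """
    losses = []
    for logits in logits_per_classifier:
        probs = softmax(logits)
        losses.append(-math.log(probs[target_class]))
    return sum(losses) / len(losses)


CIVILIAN = 0  # hypothetical class index; 1 would be "military"

# Hypothetical [civilian, military] logits from three classifiers
# for the same camouflaged frigate image.
fooled = [[2.0, -1.0], [1.5, 0.0], [3.0, -2.0]]
not_fooled = [[-1.0, 2.0], [0.0, 1.5], [-2.0, 3.0]]

loss_fooled = ensemble_fooling_loss(fooled, CIVILIAN)
loss_not_fooled = ensemble_fooling_loss(not_fooled, CIVILIAN)
print(loss_fooled < loss_not_fooled)
```

In an actual training loop, a loss of this kind would be computed on the classifiers' outputs for images carrying the generated pattern and minimized with respect to the generator's parameters. Averaging is a design choice; taking the maximum over classifiers instead would focus training on the classifier that is hardest to fool.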