Archive FM

Data Skeptic

Easily Fooling Deep Neural Networks

Duration:
28m
Broadcast on:
16 Jan 2015
Audio Format:
other

My guest this week is Anh Nguyen, a PhD student at the University of Wyoming working in the Evolving AI lab. The episode discusses the paper Deep Neural Networks are Easily Fooled [pdf] by Anh Nguyen, Jason Yosinski, and Jeff Clune. It describes a process for creating images that a trained deep neural network will misclassify. If you have a deep neural network that has been trained to recognize certain types of objects in images, these "fooling" images can be constructed in a way that causes the network to misclassify them. To a human observer, these fooling images often have no resemblance whatsoever to the assigned label. Previous work had shown that some images which appear to us as unrecognizable white noise can fool a deep neural network. This paper extends the result, showing that abstract images of shapes and colors, many of which have form (just not the form the network thinks), can also trick the network.
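To give a concrete flavor of the kind of search the paper describes, here is a minimal hill-climbing sketch in Python: start from random noise and keep only the mutations that raise a pretrained classifier's confidence in an arbitrary target class. The choice of AlexNet via torchvision, the target class index, the patch-mutation scheme, and the omission of input normalization are all simplifying assumptions for illustration, not the paper's actual evolutionary-algorithm or gradient-based setup.

```python
# Sketch: evolve a "fooling image" by hill climbing on a classifier's
# confidence. Start from random noise, mutate a small patch, and keep the
# mutation only if confidence in the chosen target class goes up.
# AlexNet, the target class index, and the mutation scheme are illustrative
# assumptions; input normalization is omitted for brevity.
import torch
import torchvision.models as models

torch.manual_seed(0)
model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT).eval()

TARGET = 407                           # arbitrary ImageNet class index
image = torch.rand(1, 3, 224, 224)     # start from random noise

def confidence(img):
    """Softmax confidence the model assigns to the target class."""
    with torch.no_grad():
        return torch.softmax(model(img), dim=1)[0, TARGET].item()

best = confidence(image)
for step in range(5000):
    candidate = image.clone()
    # Perturb one random 24x24 patch (a crude stand-in for the paper's
    # evolutionary operators).
    y, x = torch.randint(0, 200, (2,)).tolist()
    candidate[:, :, y:y + 24, x:x + 24] += 0.1 * torch.randn(1, 3, 24, 24)
    candidate.clamp_(0, 1)

    score = confidence(candidate)
    if score > best:                   # keep only improvements
        image, best = candidate, score
    if best > 0.99:                    # stop at "near certainty"
        break

print(f"final confidence in target class: {best:.4f}")
```

The key point the sketch preserves is that the search only ever consults the network's confidence score, so nothing forces the result to look like the target class to a human.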

The Data Skeptic Podcast is a weekly show featuring conversations about skepticism, critical thinking, and data science. So a few quick details before we get started with this episode. I, like presumably every data scientist living in Los Angeles, benefit greatly from the many excellent meetups and talks that are around. If you don't already know about datascience.la, the website, I presume you don't live in Los Angeles. If you do, you definitely need to go check that out. I was at one such talk earlier this week where I was lucky enough to win a free pass to the upcoming O'Reilly Strata Conference in San Jose. O'Reilly, for your reference, is the company that publishes former guest Matthew Russell's books, including the one we talked about, Mining the Social Web. So I'm very appreciative that due to chance I got this free opportunity to go, and I'm definitely going to include some coverage in upcoming episodes. If you are not yet scheduled to go, but I can talk you into it, I have a coupon code I'm going to give out in a minute. The conference is in San Jose this year, in case that matters to anyone. The coupon code will get you 20% off, it's U-G-L-A-2-0, and you enter that during checkout. So if any listeners are planning to attend, let me know, I'd love to meet up. My guest this week is Anh Nguyen. Anh is a PhD student at the Evolving AI Lab at the University of Wyoming. He was an author on a paper titled Deep Neural Networks Are Easily Fooled, which is very much at the intersection of data science and skepticism, so I was glad he took time out of his day to chat with me. I neglected to ask him if he was on Twitter; he's @ANH_NG and the number eight. So welcome to another episode of the Data Skeptic Podcast. I'm joined this week by Anh Nguyen. Thanks for joining me, Anh. My pleasure. Thank you, Kyle. So we're going to talk about an interesting paper you wrote on deep learning with some collaborators, but I thought maybe we could start by me asking about your background and how you got into this topic. Yes, I've been working with image recognition and artificial intelligence for the last eight months so far, and this is my very first project, which turns out to be very interesting. Yeah, definitely. A lot of my listeners, I think, have a pretty diverse background. Some people are casual, some people are experts in the field. So I thought it would be helpful if maybe we could give a quick layperson's definition of some of the things that we'll talk about in our conversation, things like artificial neural networks. Would you mind sharing your perspective at a high level on what these tools are? Yes, an artificial neural network is basically an abstract model of the animal brain or human brain. Basically, the network is composed of neurons and connections between neurons that transmit information, and each neuron is a mathematical function, for example a sigmoid. Built upon artificial neural networks, we have deep neural networks, or DNNs, which are basically neural networks but with more layers, more hidden layers, to solve more practical problems like image recognition or voice recognition or natural language processing. When I think of neural networks, I often think that you have some wide array of inputs. Let's say that's an image where every pixel is a possible input, or each color in a pixel is an input.
So a huge amount of data, and then you want generally one or just a few indicators on the back end, like a classification that says this is an image of a chair, or this is an image of a panda or something like that. And then there are all those neurons you're describing kind of in the middle layers. What's the key reason that people look into doing more than two layers, into what we call deep learning? Because practically, they found that with more than two layers you can actually extract hierarchical features from an image. For example, if you recognize a dog, you first look at the legs, and then you look at the eyes, and then you combine the features, the information, together in the higher layers. And then you say, okay, this is a dog, because we found the legs, we found the eyes, we found the fur. So the deep layers, or the multiple layers of representation, allow the network to build a deep representation of the dog. To my understanding, it tries to model a little bit of the human visual cortex, is that fair? Yes, it's fair enough. It's inspired by that concept. Could you tell me a little bit about the typical process for training one of these systems? At the cutting edge, we use discriminative models. So for a discriminative training process with backpropagation, basically, we show the network images and the labels of those images. For example, we have two sets, the training set and the validation set. Let's say we start with the training set. We take, for example, a batch of 50 images, and one by one we show them to the network, a fresh network which has not been trained, and then ask what it thinks about each image. It might say this is a dog. If that is the correct answer, you move on to the next image. If it is the incorrect answer, then you have to calculate the error that the network makes, and then you propagate the error back and change the weights of the network, update the network, so that the next time you show the same image it will predict correctly. You keep doing that until the accuracy is at a reasonable rate, and then you stop. That is a typical training process. That makes sense. From my perception, it seems like image recognition, especially in consumer software, has gone from being almost nonexistent a decade ago to being really prevalent. I think everyone's had an experience on Facebook or something like that where the system picks out people's faces and does generally a pretty good job with that. Yes. One might be very hasty in saying that, "Oh, the problem's been solved. See, it's working." But what's your assessment of the current state-of-the-art in image recognition? Are we at the bleeding edge or are we at the very beginning? Actually, we are at a pretty good state right now, because state-of-the-art image recognition algorithms can recognize faces and handwritten digits. You can give it mail, and it can sort the mail for you. However, these days image recognition tools are used for much higher-dimensional problems, such as photographs, 3D images, and videos, which have a time factor in there. Also, you can use image recognition tools on medical scans to look for traces of cancer and all sorts of higher-level problems. Interesting. It could speed up a lot of medical exams if all the images could go into one system, and it could take out maybe some of the human error if it could recognize problems or potential problems, which would be a really cool advancement. Exactly.
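To make the training procedure Anh walks through above more concrete, here is a minimal sketch of a discriminative training loop with backpropagation in Python using PyTorch. The tiny network, the random stand-in data, and the hyperparameters are illustrative assumptions rather than the actual models or datasets discussed in the episode; only the batch size of 50 echoes the example in the conversation.

```python
# Minimal sketch of discriminative training with backpropagation:
# show the network labeled images in batches, measure the error,
# propagate it backward, and update the weights.
import torch
from torch import nn

torch.manual_seed(0)

# Tiny placeholder classifier: flattened "image" in, one score per class out.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128), nn.ReLU(),   # hidden layer
    nn.Linear(128, 10),                   # 10 hypothetical classes
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Stand-in training data: random "images" and random labels.
images = torch.rand(1000, 1, 28, 28)
labels = torch.randint(0, 10, (1000,))

for epoch in range(5):
    for i in range(0, len(images), 50):      # batches of 50, as in the episode
        x, y = images[i:i + 50], labels[i:i + 50]
        logits = model(x)                    # ask the network what it thinks
        loss = loss_fn(logits, y)            # how wrong was it?
        optimizer.zero_grad()
        loss.backward()                      # propagate the error back
        optimizer.step()                     # update the weights
    with torch.no_grad():
        acc = (model(images).argmax(dim=1) == labels).float().mean().item()
    print(f"epoch {epoch}: training accuracy {acc:.2f}")
```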
So, your paper is Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images, if anyone wants to Google that, and I'll have it in the show notes as well. So, would you mind summarizing what you guys studied? To summarize our study, I'll first touch on what has previously been shown. In 2013, the Intriguing Properties of Neural Networks paper showed that DNNs, or deep neural networks, have amazing performance, recognizing real images with high accuracy, up to 60%. For example, if you give one an image of a school bus taken at a weird angle, it still can recognize that this is a school bus. However, if you modify the image in a way that is imperceptible to human eyes, let's say you change only a few pixels in the image and show it to the deep neural network, it is totally misclassified as, for example, a lion. It will say this is a lion. But to a human, this is the same image; you can't even recognize the difference. Yeah, a lot of the images you guys produced are just white noise to my eyes. So it's not like a pony gets classified as a donkey, where maybe even a human would make that mistake. It's images that the human eye would completely reject. Yes, in our research we showed a different phenomenon: the deep neural networks make false positives. We showed that it is easy to produce synthetic images that look totally unrecognizable to humans, but that the deep neural networks believe, with near certainty, like 99.99%, are familiar objects like lions or penguins, even though they look nothing like lions or penguins. So, when I first started reading your paper and had just kind of gotten through the abstract, I was wondering to myself, you know, maybe this is a case where they haven't trained enough, or there weren't enough layers in the network or not enough neurons, and just a few enhancements would solve it. But as I read more and more, that doesn't seem to be the case. So I was wondering, did you guys look at maybe trying to make richer training examples, or is that just sort of a fool's errand? Is there an inherent problem in the deep learning approach? I would say the answer is complicated, but if you have more images, more classes, or more constraints on the model, you will be able to get fewer of these fooling images. But the core of the phenomenon lies in the discriminative nature of the cutting-edge models. Basically, the network is trained to discriminate images, for example, to tell the difference between a table and a chair. But it's not built or trained to recognize that this is a chair that has four legs, or that this is a table that has four legs and a tabletop. No, it's just trained to differentiate between the two classes, for example. So the problem lies right there, in the discriminative nature of the networks. So, would I be correct in saying that sort of the kernel of truth here is that the networks are trying to describe the training examples by whatever features possible? And they might select features that are useful in the training examples, but not necessarily ones that are intuitive to the human eye. Is that a fair assessment? Yes. Let's say I give you a pattern of yellow and black alternating lines. What does it look like to you? Yellow and black alternating... oh, like a bee maybe? Yes, like a bee, or somebody might say it's a school bus. If you give such a pattern to the network, to the deep neural network, which is trained to discriminate between classes,
it will tell you this is a school bus or this is a bee because it found that pattern, not because it found the head of the bee or the antennae of the bee. No, it's because it found a pattern that discriminates it from other objects. Can you describe your process for generating the images that fool the networks? We first set our goal, let's say producing images that the network thinks are a school bus. So first we show a random image to the network and we ask the network to rate it. For example, it will rate this as a school bus with 1% confidence. And then we go back to the image and we use optimization algorithms, numerical optimization like gradient descent, or evolutionary algorithms. We change the image, we modify it slowly, gradually, bit by bit, and then we go back and ask for the network's opinion again. Let's say the network now thinks this is a school bus with 5% confidence; then we keep doing that. We keep changing the image until the confidence reaches 99%. Interesting. So it's kind of a stochastic search based on the feedback you're getting from the trained network. Yes. Very cool. Do you think the fooling images are exploiting specific artifacts of a specific deep learning network? Or do you think it's a general case, that you could take a fooling example and show it to lots of different researchers' networks and it's likely to fool them all? Yes, the answer would be yes, they generalize. We actually did a test in this research. We tested for different artifacts, and the problem is still there. And we even tested the fooling images on different architectures, and they were able to fool those different architectures. Yeah, one of the things I found most striking about the paper was the high confidence rates of misclassification. The human eye is far from perfect. So if I saw something and I thought it was a school bus, and it was actually a toy school bus, you know, in a dark room, that's not the worst misclassification. But some of these things, like, I think, a guitar example you have in the paper, look not even remotely like a guitar. If I had to say what it is, it's a nice piece of abstract art, if anything. But it looks nothing like a guitar, yet comes up with like 98 or 99% certainty. Yes. We have a network that's very certain of something that we think is very wrong. Does that mean we have a fundamental problem in our approach? I think to answer this question, it's a bit of a long answer. For example, when we tested the images in the validation set, we gave like 50,000 images to the network. And for all the images that the network rates correctly as the classes they are supposed to be, the median confidence turns out to be only 60%. So that gives a relative reference. If we have like a 99% confidence, that is pretty high compared to the median. And that 60% comes from your cross-validation testing, I presume? Yes, from the validation set. So perhaps one way we could look at this is to say we have a classifier that is 60% accurate, which is pretty good, but certainly not perfect by any standard. So it's going to make some mistakes by definition, and some of those mistakes may or may not be at high confidence levels. Do you think that would be an appropriate way to be skeptical of the results of a deep learning trained example?
I think we have a misunderstanding here. The confidence is, let's say, if you give the network an image of a school bus, it thinks this is 60% a school bus, and then 10% a flat tire, because it has a tire as well. That is the confidence that we refer to in the paper. But the 60% you describe is accuracy. Let's say I give you 10 images and you correctly classify six of them; that is 60% accuracy, but it's not equivalent to 60% confidence. Got it, got it. Yeah. So how much do you think a good training corpus of images affects the end result and the likelihood of a network having a potential weakness like this? In other words, do we kind of blame all the fooling on having a set of images that isn't perhaps robust enough, or doesn't show all of the possible space of features that might be seen in the future, or is the problem more significant than that? I think the problem is more significant than this, because it is the discriminative nature of the network that causes the problem. Say a network is trained to classify a husky A type of dog and a husky B type of dog, and husky A has white fur while husky B has yellow fur. Then all it sets out to do is classify the fur color. So if I give it a white image, it says, okay, husky A; if I give it a yellow image, it says it is husky B, without actually looking for the legs or the head of the dog. It's a much bigger problem than just the training images. In a lot of other machine learning approaches, take something like decision tree learning, some people will say they like those because there's a human-readable component to them. I might disagree, because if your decision tree has thousands of nodes, it's very difficult to read, but at least there's generally something intuitive about it, and people can kind of look at the various decision steps and say, okay, I can see the rationale that's exposed here. But it doesn't seem that you have the same luxury in deep learning, because the result of your network is a description of how the neurons are connected and all their weights. And so just knowing where the neurons are and what their weights are doesn't tell you anything about, oh, it's classifying dogs by detecting fur or by detecting color or other things like that. Do you think there's a way around that, where we can extract intuition from the networks, or is that just the nature of the problem? You mean looking into how the network classifies an image of a dog? Exactly. Like, yeah, what features is the network actually using for that classification? Yes, actually, there was a previous study that looked at an image of a dog, gave it to the network, and let's say the network thinks this is 99% a dog. Then the authors went and disabled a few pixels, let's say blocked the eyes of the dog, and then the network lowered its confidence. So by doing that, you actually can find out which features the network is looking at to give that classification. Interesting, yeah. Most humans have had an experience where they thought they saw something they didn't, and this is generally a phenomenon we call "pareidolia." Are you familiar with that term? Yes, "pareidolia," yes. Do you think it's a fair analogy to say this is like machine pareidolia, or are these very different phenomena? We think that these are two separate phenomena, because with pareidolia, as a human, you look at the sky and you say, "Okay, I saw a sheep in the cloud."
Because you tell yourself that you see a sheep in the cloud, but you still think this is a cloud; you just also saw the shape of a sheep. So you're not actually fooled at all. You don't think this is a sheep. You think it's a cloud. But here, the machine thinks this is a sheep, not a cloud. That's an interesting point. Yeah, definitely. In a similar way, there are a lot of evolved images in the paper, things like parrots and coffee cups, that both humans and your networks identify that way. But we recognize, at least I do, that these are obviously constructed images. They're more like art than a photograph. Yes. Are those the result of that sort of stochastic search process we were discussing earlier? Yes, they are a result of the evolutionary algorithm we used. I think if you start from, let's say, a random image that classifies as 1% school bus, and then you start making perturbations and mutations to maximize that confidence, a number of the evolved images start to look like something to the human eye, whether it be a parrot or a black panther or something like that. Yes. But of course, the starting point probably looks more like white noise. So there has to be both a convergence process for the confidence of the network, and also probably a convergence of when people start to agree that the image is a facsimile of what the network is detecting. Do you find that to be the case? Yes, we find that to be the case, yes. So do you think that's something in the nature of images being more abstract, that eventually the network can confidently say it's a facsimile, and that's okay? Or is it that deep neural networks are likely to always miss the subtle difference between a facsimile and a true image? I think the answer would be that when the network starts to see some features, let's say a leg of a dog, then it will make its decision. So if you can produce an image that somehow shows a leg of a dog, and then you copy and paste only that feature around the image, you actually can raise the confidence of the network. Interesting. If we have good reason to believe that deep neural networks can be fooled, and I think we do from your paper, can we rely on these technologies? If you were in charge of setting up security at a sensitive location, would you trust facial recognition to be a part of the security? That's a very interesting question. I would say security is all about trade-offs, so it really depends on how sensitive the task or the location is. If it is like the White House, then probably I would not use it. I myself am very eager to own one of these self-driving cars we're hearing so much about. Do you think I should be afraid that it might think that some dirt on the road is an obstacle and swerve to avoid it? Actually, that is a reasonable fear, but we think this fooling phenomenon would be less likely to occur in such a high-dimensional space. You have a lot of constraints: it is 3D imagery, and it is also video. So you have a lot of constraints, and it's much harder to find a fooling example that fools the network. What can be done by you and other future researchers to fix these sorts of issues and prevent deep neural networks from being fooled? Yes, we feed the network the fooling images: we create a new category called garbage, and we put all the fooling images that previously fooled the network into thinking they are familiar objects into this garbage category.
And then we train the network again, so now the network knows that these images are garbage. The answer actually is complicated. We tested on two different scales of architecture. On the larger-scale architecture, the network actually can recognize the garbage from the previous round, so the network is not subject to fooling anymore. But for the smaller-scale network, we actually can keep producing fooling images even after 15 rounds of training. So let me see if I understand correctly: you do a traditional training, then you use the stochastic search to generate some fooling examples, and these constitute new training examples with a new categorical label of garbage. So hopefully the system can learn to identify garbage correctly, improve for the second round, and kind of do that iteratively. And after, I think you said, 15 iterations, you're still able to produce fooling images? Yes, after 15 iterations, the network is still fooled by the newly produced images. Wow, is it getting better, or is it converging at all, or do you think this is sort of an unending tunnel to go down? That result is for the small-scale, or MNIST, network, but for the large-scale one, after one iteration the network can already recognize that this is garbage, and it's not fooled anymore. So there are two results here; it's a bit complicated. What do you think is the distinction? In my mind, I would think smaller networks would have less dimensionality and might be easier, but it sounds like it's the opposite. Yes, it's sort of the opposite. We still cannot make a conclusion about why this is happening at the moment. So an area of future research? Yes, that's part of our future research, yes. Cool, well that brings me to another good question: what's next for you and your co-authors? Yes, sure, we'll further investigate the fooling problem on different network architectures, especially generative models, which are different from the cutting-edge models, which are discriminative. So we'll try to look at this problem from a different perspective and see if the fooling phenomenon still happens with generative models. So before we close: this is a podcast both for data scientists and for skeptics, so I feel compelled to ask your thoughts on the famous Face on Mars image artifact. Are you familiar with that at all? Is that similar to pareidolia? It is a little bit, yeah. There's a conspiracy theorist who thinks he's found evidence of civilizations on Mars, and one such piece of evidence is a big monument that he claims is a human face, which you can see in some of the, I think it's the Voyager images. Okay, I see. So I know we're getting a bit off topic, but I like to throw a bone to some of my skeptical listeners sometimes. Do you think image recognition researchers should pay attention to negative examples like that? Or are these just interesting curiosities? Actually, it's a valid negative example. I think we should pay attention, but currently the networks are still fooled by those TV-static images, which look nothing like a face. So we still have a lot to do before worrying about the networks being fooled by a Face on Mars image.
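Returning to the retraining experiment described a moment ago, here is a rough sketch of the iterative "garbage class" loop: train a classifier, search for images that fool the current model into a real class, add them to an extra garbage category, retrain, and repeat. The tiny model, the synthetic data, the hill-climbing generator, and the thresholds are all placeholder assumptions meant only to show the structure of the loop, not the MNIST or ImageNet experiments from the paper.

```python
# Sketch of the iterative "garbage class" defense: train a classifier,
# search for images that fool it into a real class, label them as an extra
# "garbage" class, retrain, and repeat. Model, data, and thresholds are
# synthetic placeholders that only illustrate the structure of the loop.
import torch
from torch import nn

torch.manual_seed(0)
N_REAL_CLASSES = 10
GARBAGE = N_REAL_CLASSES                      # index of the extra class

def make_model():
    return nn.Sequential(nn.Flatten(),
                         nn.Linear(16 * 16, 64), nn.ReLU(),
                         nn.Linear(64, N_REAL_CLASSES + 1))

def train(model, x, y, epochs=30):
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def fooling_image(model, target, steps=200):
    """Hill-climb from noise toward high confidence in `target`."""
    img = torch.rand(1, 1, 16, 16)
    best = torch.softmax(model(img), 1)[0, target].item()
    for _ in range(steps):
        cand = (img + 0.05 * torch.randn_like(img)).clamp(0, 1)
        score = torch.softmax(model(cand), 1)[0, target].item()
        if score > best:
            img, best = cand, score
    return img, best

# Synthetic stand-in for a labeled training set of real images.
real_x = torch.rand(500, 1, 16, 16)
real_y = torch.randint(0, N_REAL_CLASSES, (500,))
garbage_x = torch.empty(0, 1, 16, 16)

for round_num in range(3):                    # the paper ran many more rounds
    x = torch.cat([real_x, garbage_x])
    y = torch.cat([real_y, torch.full((len(garbage_x),), GARBAGE)])
    model = make_model()
    train(model, x, y)

    # Generate new images that fool *this* round's model into a real class.
    new_fools = [img for img, conf in
                 (fooling_image(model, t) for t in range(N_REAL_CLASSES))
                 if conf > 0.5]                # arbitrary "fooled" threshold
    print(f"round {round_num}: {len(new_fools)} new fooling images")
    if new_fools:
        garbage_x = torch.cat([garbage_x] + new_fools)
```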
You apply genetic algorithms to evolve example cases, and obviously genetic algorithms are inspired by what biologists have learned about the natural process of evolution. Presumably our eyes went down some such path: very distant species that came before Homo sapiens had eyes that weren't quite as good as ours. Do you think biological evolution had the same types of fooling problems early on, or is this a situation specific to machine learning? I think it happens to humans as well, and it happens to animals as well. Actually, biologists have previously shown that animals are also susceptible to optical illusions. Oh, interesting. Yes, let's say a lizard can turn its skin green or different colors just to fool its enemies. Ah, yeah, very true. So, yeah, optical illusions happen in nature as well. So it's interesting that they also happen with machines, yeah. Yeah, absolutely. So, lastly, I like to ask my guests for two recommendations, and they can be anything you like, a book, a research paper, a software package, whatever. The first is the benevolent recommendation, something you appreciate and would like to promote but don't have any affiliation with. And the second is the self-serving recommendation, something you ideally get a direct benefit from through your appearance here. Okay, cool. Yeah, I think my first recommendation would be the Caffe image recognition software package. It has off-the-shelf deep neural networks that people in industry, or even enthusiasts, can use for recognition off the shelf. And it also has online tools to do image recognition; you can upload images and then see what the network thinks about that image. Very neat. My second recommendation would be: please go check out our website, evolvingai.org, which has a lot of interesting publications. We do some cool stuff combining evolutionary computation and deep learning. Nice. Yeah, I know we're on an audio podcast, so it's kind of ironic to be talking about images, so I would definitely recommend that everyone go to the website. And I also found a YouTube video I think you guys made on the topic from the Evolving AI Lab, so I'll link that in the show notes as well. I think it'd be great for all the listeners to go check out some of the images. The video is only five minutes long, and it really does an excellent job explaining and giving examples, so I really enjoyed watching it. Yeah, thank you, Kyle. Well, this has been great. Thanks so much for coming on. Yeah, thank you, Kyle. My pleasure. Take care. [music]