These networks have to be trained, and that's typically done by showing them lots of images of the things they're being trained to recognize. That works to some extent, but the researchers want to know what's going on at each step of the process so that they can improve how the software works.
We train an artificial neural network by showing it millions of training examples and gradually adjusting the network parameters until it gives the classifications we want. The network typically consists of 10-30 stacked layers of artificial neurons. Each image is fed into the input layer, which then talks to the next layer, until eventually the “output” layer is reached. The network’s “answer” comes from this final output layer.

It turns out that a network with the knowledge to recognize something in an image has most of the knowledge needed to create images from nothing.
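To make that concrete, here's a minimal sketch of such a stacked-layer classifier and one training step. Python with PyTorch is my choice here, not Google's actual code, and the layer sizes are purely illustrative:

```python
# A tiny stacked-layer classifier, sketched in PyTorch (illustrative
# sizes; Google's networks were far larger convolutional models).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 256), nn.ReLU(),  # early layer: simple features
    nn.Linear(256, 64), nn.ReLU(),       # middle layer: combinations
    nn.Linear(64, 10),                   # output layer: class scores
)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

def train_step(images, labels):
    """One gradient step: nudge the parameters toward the right answers."""
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```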
One of the challenges of neural networks is understanding what exactly goes on at each layer. We know that after training, each layer progressively extracts higher and higher-level features of the image, until the final layer essentially makes a decision on what the image shows. For example, the first layer may look for edges or corners. Intermediate layers interpret those basic features to look for overall shapes or components, like a door or a leaf. The final few layers assemble those into complete interpretations; these neurons activate in response to very complex things such as entire buildings or trees.
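You can at least capture what an intermediate layer outputs, even if interpreting it is the hard part. A sketch continuing from the toy model above (the hook mechanics are standard PyTorch; which features a layer encodes is exactly the open question):

```python
# Sketch: capture an intermediate layer's activations for a given image,
# continuing from the toy model sketched above.
activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Watch the middle Linear layer of the toy model.
model[3].register_forward_hook(save_activation("middle"))
_ = model(torch.randn(1, 1, 28, 28))   # feed in a (random) "image"
print(activations["middle"].shape)     # torch.Size([1, 64])
```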
“One way to visualise what goes on is to turn the network upside down and ask it to enhance an input image in such a way as to elicit a particular interpretation,” they add. “Say you want to know what sort of image would result in ‘banana’. Start with an image full of random noise, then gradually tweak the image towards what the neural net considers a banana.”

In other words, they generate a set of random pixels and feed it into the algorithm. The output has some features that are more reminiscent of bananas than before, so they use that as the input and run the algorithm again. Eventually, they get something like this:
That doesn't look like a photo of bananas, but it's more like bananas than the original input. In one interesting example, they asked the network to produce images of dumbbells. The resulting images always included part of an arm holding the dumbbell! The network had learned to recognize dumbbells, but because so many of the training photos showed a muscular arm gripping the weight, it assumed the arm was part of the dumbbell. The solution might be simply to give it many more training images that don't include an arm, so that it learns the arm isn't part of the object.
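In code, that "tweak towards banana" step amounts to gradient ascent on the pixels instead of on the network's weights. A minimal sketch, assuming an ImageNet-style PyTorch classifier; the class index, step count, and step size are made up, and Google's real method also constrained the images to keep natural-looking statistics:

```python
# Sketch: start from random noise and nudge the pixels so the network's
# "banana" score climbs (hyperparameters are illustrative only).
import torch

def dream_class(model, class_idx, steps=200, lr=0.05):
    img = torch.randn(1, 3, 224, 224, requires_grad=True)  # pure noise
    for _ in range(steps):
        score = model(img)[0, class_idx]  # how banana-like is it now?
        score.backward()                  # gradient w.r.t. the pixels
        with torch.no_grad():
            img += lr * img.grad / (img.grad.abs().mean() + 1e-8)
            img.grad.zero_()
    return img.detach()
```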
The surprise to me was the stunning, beautiful abstract art the networks are creating. The Google labs team says,
If we apply the algorithm iteratively on its own outputs and apply some zooming after each iteration, we get an endless stream of new impressions, exploring the set of things the network knows about. We can even start this process from a random-noise image, so that the result becomes purely the result of the neural network...

If the network has been trained to recognize buildings, and random noise is fed in, with some zooming and selecting of areas to work on, you eventually come up with fantasy-impressionistic landscapes like this one
Fountains, pagodas, arches, aqueducts, and ponds with fountains - or icebergs. It's reminiscent of an album cover from the days of widespread hallucinogens. More incredible eye candy at this gallery.
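The loop they describe is easy to express: enhance the image, zoom in a little, crop back to the original size, repeat. A sketch under the same assumptions as above, with enhance() standing in for one round of the pixel-nudging shown earlier:

```python
# Sketch of the iterate-and-zoom loop: each pass amplifies whatever the
# network "sees", then zooms in slightly so new detail keeps emerging.
import torch.nn.functional as F

def dream_stream(img, enhance, iterations=100, zoom=1.05):
    h, w = img.shape[-2:]
    for _ in range(iterations):
        img = enhance(img)                 # amplify the net's impressions
        img = F.interpolate(img, scale_factor=zoom, mode="bilinear",
                            align_corners=False)
        top = (img.shape[-2] - h) // 2     # crop the center back to
        left = (img.shape[-1] - w) // 2    # the original size
        img = img[..., top:top + h, left:left + w]
    return img
```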
As the Guardian says,
Androids don’t just dream of electric sheep; they also dream of mesmerising, multicoloured landscapes.
This is absolutely the freakiest thing I've read in a loooong time.
Some of those are strangely beautiful, and others look like.....well, I don't know what.
Anon - it's the freakiest thing I've seen in a looooong time, too.
I can't emphasize enough that you should go to that "eye candy" photo gallery and look around.
It's fascinating how stuff like this bleeds out into the general consciousness after having been done for years or decades. It isn't difficult to understand how these are generated -- you train a large system of equations to respond to a given input, and then expose it to random noise. Unsurprisingly, the network gives back things that look like the input it was trained to respond to. I was doing this in 1988 with a 486 computer; it's not difficult.
That said, I'm amazed there hasn't been more bleedthru from engineering to art. Perhaps there should be more communication between engineers/scientists/mathematicians and the artistic community, because there is a huge amount that can be done artistically with common engineering tools and programs. Perhaps engineers should make more of an attempt to publish their work. Perhaps artists should make more of an attempt to train themselves in the scientific arts.
Now go thou and study yourselves some fractal mathematics! ;-)
Ah, yes, doing Mandelbrot sets on a 486; I remember those days! Or was it on a 386? I think I did that in full floating point when I first got my 80387. Then Fractint came around and you didn't have to let the computer sit for hours doing a plot.
I think the main reason there's little communication between the artistic community and scientists/engineers is that they look down on us. They think we're such knuckle draggers; what could we possibly add of any value?
Broad brush generalizations, not all artists, yada, yada, yada. And I'm influenced here by an article I saw in the last week by some sort of liberal arts graduate wondering if engineers could even think. After all, if we can't think, how could we possibly give them anything of any value?
And after all, wouldn't you rather look at something with deep meaning like a crucifix in urine or watch someone put sharp objects through their body in some form of "performance art"? What good is a bunch of abstract shapes? That's SO last century.