All of this seems rather straightforward, but consider the radar system designers in the Tesla incident we started this series with. Did they not consider that the interstate signs would be big reflectors and that their radar would illuminate them? Or did the antenna design get compromised while trying to fit it into the car's body? Remember, Elon Musk tweeted, “Radar tunes out what looks like an overhead road sign to avoid false braking events.” “Tuning out” a return is not how a radar guy talks. The software either ignored any return over some level, or the designers took measures to ensure they never got returns that strong, perhaps by aiming the antenna down.
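To make that concrete, here's a minimal sketch, with invented names and thresholds, of what “ignore any return over some level” could look like in software, and why the same cutoff that rejects an overhead sign can also reject the broadside of a trailer. This is purely illustrative; nobody outside Tesla knows how their filter actually works.

```python
# Hypothetical, simplified radar-return filtering sketch (names and numbers invented).
# The point: a plain amplitude cutoff meant to "tune out" overhead signs will also
# discard a big flat trailer broadside, because both are very strong reflectors.
from dataclasses import dataclass

@dataclass
class RadarReturn:
    range_m: float        # distance to the reflector
    amplitude_db: float   # return strength
    elevation_deg: float  # angle above the antenna boresight

SIGN_REJECT_DB = 30.0     # hypothetical "too strong, must be a sign" cutoff

def naive_filter(returns):
    """Drop anything stronger than the cutoff, as the tweet implies."""
    return [r for r in returns if r.amplitude_db < SIGN_REJECT_DB]

def elevation_aware_filter(returns):
    """Less naive: only reject strong returns that are well above the road."""
    return [r for r in returns
            if r.amplitude_db < SIGN_REJECT_DB or r.elevation_deg < 5.0]

overhead_sign = RadarReturn(range_m=120.0, amplitude_db=38.0, elevation_deg=12.0)
trailer_side  = RadarReturn(range_m=60.0,  amplitude_db=36.0, elevation_deg=1.0)

print(naive_filter([overhead_sign, trailer_side]))            # both thrown away
print(elevation_aware_filter([overhead_sign, trailer_side]))  # the trailer survives
```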
Now let's look at the system that really seems to have the least chance of working properly: artificial vision. A vision system is most likely going to be used to solve an obvious problem: how does the car know where the lane is? That's what a lot of early work in autonomous cars focused on, and there are times and conditions where it's not at all trivial. Snow or sand covering the road is an obvious concern, but what about when there's road construction and lanes are redirected? Add a layer of rain, snow or dirt on top of already bad or confusing markings and the accuracy will suffer. When the paint is gone or can no longer be seen, what does the system do?
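Here's a minimal sketch of classic lane-marking detection (grayscale, edge detection, then a Hough line search), assuming OpenCV and a hypothetical dashcam frame called "road.jpg". It isn't how any particular automaker does it; the point is that every step depends on visible, high-contrast paint, and the interesting question is what happens in the branch where no markings are found.

```python
# Classic lane-marking sketch: every stage here assumes the paint is visible.
import cv2
import numpy as np

frame = cv2.imread("road.jpg")                    # hypothetical dashcam frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
edges = cv2.Canny(blur, 50, 150)                  # faded paint -> weak or missing edges

# Look for long, roughly lane-like line segments
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                        minLineLength=60, maxLineGap=20)

if lines is None or len(lines) < 2:
    # The interesting branch: markings gone, snow-covered, or repainted for construction.
    # What does the system do now? Hand control back? Guess? Follow the car ahead?
    print("Lane markings not found")
else:
    print(f"Found {len(lines)} candidate lane segments")
```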
A few weeks ago, Borepatch ran a very illuminating article (if you'll pardon the pun) about the state of AI visual recognition.
This was a recent news piece in the Register (UK). In the mid-80s, I took a senior-level Physical Optics class which included topics in spatial filtering as well as raytracing-level optics. The professor said (as best I can quote 30+ years later), “You can choke a mainframe trying to get it to recognize a stool, but you always find that if you show it a new image that's not quite like the old ones, it gets the answer wrong. It might see a stool at a different angle and say it's a dog. Dogs never make that mistake.” Borepatch phrased the same idea this way: “AI does poorly at something that every small child excels at: identifying images. Even newborn babies can recognize that a face is a face and a book is not a face.” Now consider how many generations of processing power have passed between my optics class and the test Borepatch described, and it just seems that the problem hasn't really been solved yet. (Obvious jokes about the dog humping the stool left out to save time.)

The problem is that although neural networks can be taught to be experts at identifying images, having to spoon-feed them millions of examples during training means they don't generalize particularly well. They tend to be really good at identifying whatever you've shown them previously, and fail at anything in between. Switch a few pixels here or there, or add a little noise to what is actually an image of, say, a gray tabby cat, and Google's TensorFlow-powered open-source Inception model will think it's a bowl of guacamole. This is not a hypothetical example: it's something the MIT students, working together as an independent team dubbed LabSix, claim they have achieved.
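For the curious, here's a hedged sketch of the kind of attack described there. It is not LabSix's actual method; it's the textbook "fast gradient sign" trick: take one gradient step that nudges the pixels just enough to flip the classifier's answer. It assumes PyTorch/torchvision and an image file "cat.jpg", and it does use an Inception model since that's what the quote names.

```python
# A single FGSM-style perturbation against a pretrained Inception classifier.
# Illustrative only; one small step may need a larger epsilon (or more steps)
# to actually flip the label, and the torchvision weights API varies by version.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

model = models.inception_v3(weights="DEFAULT")   # pretrained ImageNet classifier
model.eval()

preprocess = T.Compose([
    T.Resize(342), T.CenterCrop(299), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

x = preprocess(Image.open("cat.jpg").convert("RGB")).unsqueeze(0)
x.requires_grad_(True)

logits = model(x)
orig_label = logits.argmax(dim=1)                # what the network thinks it sees now

# One gradient step *away* from that label (untargeted attack)
loss = torch.nn.functional.cross_entropy(logits, orig_label)
loss.backward()
epsilon = 0.02                                   # "a little noise", barely visible
x_adv = (x + epsilon * x.grad.sign()).detach()

adv_label = model(x_adv).argmax(dim=1)
print("before:", orig_label.item(), "after:", adv_label.item())
```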
Borrowing yet another quote on AI from Borepatch:
So why such slow progress, for such a long time? The short answer is that this problem is really, really hard. A more subtle answer is that we really don't understand what intelligence is (at least being able to define it with specificity), and so that makes it really hard to program.

That's my argument. We don't know how our brains work in many details; pattern or object recognition is just the big example that's relevant here. A human chess master looks at a board and recognizes patterns that they respond to. IBM's Deep Blue just analyzed every possible move through brute-force number crunching. The chess master doesn't play that way. One reason AI wins at chess or Go is that it plays the games differently than people do, and the people the AI systems are playing against are used to playing against other people.
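To make "brute-force number crunching" concrete, here's a toy minimax search over a deliberately trivial made-up game (a Nim variant standing in for chess). Deep Blue's real search added pruning, move ordering and a huge tuned evaluation function, but the flavor is the same: enumerate every legal move, score the outcomes, pick the best. There is no pattern recognition anywhere in it.

```python
# Toy minimax: exhaustive "try every move" search, the opposite of a master's intuition.
class Nim:
    """Stand-in for 'a game': take 1-3 stones; whoever takes the last stone wins."""
    def __init__(self, stones, to_move_is_max=True):
        self.stones = stones
        self.to_move_is_max = to_move_is_max

    def is_over(self):
        return self.stones == 0

    def evaluate(self):
        # If the game is over, the player who just moved took the last stone and won.
        if self.stones == 0:
            return -1 if self.to_move_is_max else +1
        return 0  # non-terminal positions get a neutral static score

    def legal_moves(self):
        return [n for n in (1, 2, 3) if n <= self.stones]

    def apply(self, move):
        return Nim(self.stones - move, not self.to_move_is_max)

def minimax(game, depth, maximizing):
    """Score a position by exhaustively trying every move to a fixed depth."""
    if depth == 0 or game.is_over():
        return game.evaluate()            # just a number; no "understanding"
    scores = [minimax(game.apply(m), depth - 1, not maximizing)
              for m in game.legal_moves()]
    return max(scores) if maximizing else min(scores)

def best_move(game, depth=10):
    # The opponent minimizes on the next ply, hence maximizing=False.
    return max(game.legal_moves(),
               key=lambda m: minimax(game.apply(m), depth - 1, maximizing=False))

print(best_move(Nim(7)))   # prints 3: taking 3 leaves the opponent stuck at 4
```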
We don't know what sort of system the Tesla had, whether it was simple photosensors or true image capture with real image-analysis capability, but it seems to be the latter, based on Musk saying the CMOS image sensor was seeing “the white side of the tractor trailer against a brightly lit sky.” The sun got in its eye? The contrast was too low for the software to work? It matters. In an article in the Register (UK), Google talked about problems their systems had in two million miles of trials: things like traffic lights washed out by the sun (we've all had that problem), traffic lights obscured by large vehicles (ditto), hipster cyclists, four-way stops, and other situations that we all face while driving.
A synthetic vision system might be put to good use determining whether the car in front has hit its brakes. A better approach might be for every car to have something like a MAC (EUI-48) address and broadcast to all nearby cars that vehicle number 00:80:c8:e8:4b:8e has applied its brakes and is decelerating at X ft/sec^2. Every car would then need software that tracks every MAC address it can hear and determines how much of a threat each of those cars is; a rough sketch of the idea follows.
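Here's a hedged sketch of that vehicle-to-vehicle idea: a small "I'm braking" message tagged with the sender's unique address, and a listener that keeps a table of every neighbor it can hear. The message format and threat threshold are invented for illustration; real V2V work (DSRC / C-V2X basic safety messages) is far richer than this.

```python
# Toy V2V broadcast/tracking sketch with an invented message format.
import json
import time

def braking_message(sender_id, decel_ft_s2, speed_ft_s):
    """What vehicle 00:80:c8:e8:4b:8e might broadcast when the brakes are applied."""
    return json.dumps({
        "id": sender_id,
        "event": "braking",
        "decel_ft_s2": decel_ft_s2,
        "speed_ft_s": speed_ft_s,
        "timestamp": time.time(),
    })

class NeighborTracker:
    """Each car runs something like this: track every address it can hear."""
    def __init__(self, hard_braking_ft_s2=15.0):
        self.neighbors = {}                 # last message heard from each address
        self.hard_braking = hard_braking_ft_s2

    def on_message(self, raw):
        msg = json.loads(raw)
        self.neighbors[msg["id"]] = msg
        if msg["event"] == "braking" and msg["decel_ft_s2"] >= self.hard_braking:
            return f"THREAT: {msg['id']} braking hard ({msg['decel_ft_s2']} ft/s^2)"
        return None

tracker = NeighborTracker()
alert = tracker.on_message(braking_message("00:80:c8:e8:4b:8e", 22.0, 95.0))
print(alert)
```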
A very obvious need for artificial vision in a car is recognizing signs. Not just street signs and stop signs, but informational signs like "construction ahead", "right lane ends" and other things critical to safe operation. It turns out Borepatch even wrote about this topic. Quoting that article from the Register, a confession that getting right the little things people do every day still overwhelms the Google self-driving cars:
You can teach a computer what an under-construction sign looks like so that when it sees one, it knows to drive around someone digging a hole in the road. But what happens when there is no sign, and one of the workers is directing traffic with their hands? What happens when a cop waves the car on to continue, or to slow down and stop? You'll have to train the car for that scenario.

It's impossible to teach ethics to a computer. It's impossible to teach the computer "if a child runs out after that ball, slam on the brakes, and if you can't stop, hit something like a parked car". A computer isn't going to understand the concept of "child" or "person". Good luck with concepts like "do I hit the adult on the bike or injure my passengers by hitting the parked bus".
What happens when the computer sees a ball bouncing across a street – will it anticipate a child suddenly stepping out of nowhere and chasing after their toy into oncoming traffic? Only if you teach it.
But that's a question for another day. Given the holiday, let's pencil in Friday.
Question for the ADAS: now what?