For quick review, a camera takes a picture of the scene, one frame, at whatever the frame rate is. A movie is a sequence of vast numbers of these frames. In a film movie, all of the frames are projected one at a time onto a screen, and your eye/brain combination perceives them as smooth motion. Video cameras originally exposed a small sensor in a pattern of lines called a raster, re-tracing to start another "scan line" and progressing down the screen line by line. Today's cameras hold solid-state sensors like the ones in a digital camera that capture an entire frame at once, but those sensors are still read out in the same basic line-by-line way. In the video realm, the rate varies with the mode, but runs from around 30 to 60 frames per second here in the US. Each one of those frames is conveyed to the user, either through video transmission or recording, and is played back at that same rate.
In a sense, this was the only way video could have been developed: one photograph at a time, followed by one line at a time transmitted as an analog signal. But think about a video of a scene being updated 60 times every second. The vast majority of the video is identical from frame to frame. In a piece on radio communications, I briefly mentioned Shannon's Information Theory; let me give the 5-minute University version: if I tell you something you already know, I haven't provided any information. That means the video transmission is horribly inefficient, sending the same scene over and over - 60 times a second! - while conveying almost no new information.
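Shannon's point can be made concrete with his measure of self-information, -log2(p): an event you were certain of (probability 1) carries zero bits, while improbable events carry many. A minimal sketch (the function name is mine, not from any particular library):

```python
import math

def self_information(p: float) -> float:
    """Bits of information in observing an event of probability p (Shannon)."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(p)

# A frame you already knew was coming (p = 1) carries 0 bits -
# retransmitting an unchanged scene tells the receiver nothing.
assert self_information(1.0) == 0

# A fair coin flip carries 1 bit; a 1-in-1024 surprise carries 10 bits.
assert self_information(0.5) == 1.0
assert abs(self_information(1 / 1024) - 10) < 1e-9
```

By this measure, a video stream that repeats an unchanged frame 60 times a second is spending bandwidth on near-zero-information messages.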
But what if we looked at the scene differently? What if we looked at the whole frame and only told the receiver which pixels had changed? If nothing changed - the characters in the SitCom just sat there for one frame and nobody moved - nothing would be transmitted. That would cut the data being sent tremendously, but what was sent would be all new information. The data rate required for your cable system or other transmission just got hundreds of times lower while the information throughput stayed the same (only the pixels that changed are sent). To some extent, that's the basis of how compression algorithms work: they send what's changing and don't re-send the parts that aren't. The difference is changing from a frame-based system to an event-based system.
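A toy sketch of that frame-to-event shift (my own illustration, not any real codec or sensor protocol): instead of sending every pixel of every frame, send only (row, column, new value) events for pixels that differ from the previous frame.

```python
def frame_events(prev, curr):
    """Return (row, col, new_value) events for pixels that changed
    between two equal-sized frames (lists of lists of pixel values)."""
    return [(r, c, curr[r][c])
            for r in range(len(curr))
            for c in range(len(curr[0]))
            if curr[r][c] != prev[r][c]]

prev = [[0, 0, 0],
        [0, 0, 0]]
curr = [[0, 9, 0],      # one pixel changed
        [0, 0, 0]]

events = frame_events(prev, curr)   # -> [(0, 1, 9)]
static = frame_events(curr, curr)   # -> [] : a static scene sends nothing
```

A full frame here is 6 values every time; the event stream is 1 event when one pixel changes and nothing at all when the scene is static - the same information, a fraction of the data.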
It turns out that the brain and eye work in a similar way. Years ago, I read about an experiment in which a dog's brain had been surgically tapped so that optic nerve activity could be measured. When the dog first went into a new room, the activity was intense. After a while in the room, with nothing moving or changing, nerve activity went down to a much lower rate. The experimenters introduced something new into the room and again the optic nerve started firing like crazy. Once the new object was examined and accepted, the rate went down again. The conclusion was that the eye was operating like a distributed processor, looking at everything and updating what changed.
The idea is now being pursued in the electronics industry, primarily motivated by the much-hyped "Internet of Things" that so many of us talk about. Vision is being added, or soon will be added, to tons of systems, and it might help to think about vision completely differently. Consider a security camera. Might it make more sense for a camera watching a door to sit quietly and send nothing until something approaches the door, rather than send a non-changing image of the door over and over, 30 frames per second? Think of it as a sensor rather than an imager.
French startup Chronocam comes from the vision-research world. Its two principal researchers are Ryad Benosman, a mathematician who has done original work on event-driven computation, retinal prosthetics, and neural sensing models, and Christoph Posch, who has worked in neuromorphic analog VLSI circuits (very large scale integration), CMOS image and vision sensors, biology-inspired signal processing, and biomedical devices and systems. The inspiration for Chronocam's event-driven vision sensors comes from studies of how the human eye and brain work.
According to Benosman, human eyes and brains "do not record the visual information based on a series of frames." Biology is in fact much more sophisticated. "Humans capture the stuff of interest—spatial and temporal changes—and send that information to the brain very efficiently," he said.

These are really just the early days of event-driven vision as a field. The people conceptualizing machine vision for systems that are coming in four to five years are thinking about this now. It's the time for companies like Chronocam, which are selling all-new technologies and all-new ideas, to be getting their technical vision in place (pun intended).