In a study published in the Proceedings of the National Academy of Sciences, a team of researchers has engineered an advanced computer system based on the same method of visual learning that humans use. The technology belongs to the field of computer vision: the ability of computers to read and identify visual images. The study is a step toward improving general artificial intelligence systems.
The “computer vision” system can identify objects based on only partial glimpses, such as these photo snippets of a motorcycle. Credit: UCLA
The researchers drew on ideas from cognitive psychology and neuroscience to create a framework for developing the system. "Starting as infants, we learn what something is because we see many examples of it, in many contexts," explained principal investigator Vwani Roychowdhury. "That contextual learning is a key feature of our brains, and it helps us build robust models of objects that are part of an integrated worldview where everything is functionally connected."
The system works in three steps. First, it breaks an image into small chunks, which the authors of the study refer to as "viewlets." Second, the computer learns how the viewlets fit together to form the object in question. Third, the computer interprets other objects in the surrounding area and their relevance to identifying the main object.
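The first of these steps, cutting an image into small chunks, can be illustrated with a short Python sketch. This is not the researchers' code: the function name `split_into_patches`, the fixed square patch size, and the toy image are all assumptions made for illustration, and the study's viewlets may well be chosen in a more sophisticated way than a plain grid.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values


def split_into_patches(image: Image, size: int) -> List[Tuple[int, int, Image]]:
    """Cut an image into non-overlapping patches of at most size x size pixels.

    Hypothetical helper, not taken from the study. Returns (row, col, patch)
    tuples, where (row, col) is the top-left corner of the patch. Patches on
    the right and bottom edges may be smaller than size x size.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    patches = []
    height = len(image)
    width = len(image[0]) if image else 0
    for top in range(0, height, size):
        for left in range(0, width, size):
            # Take `size` rows starting at `top`, and `size` columns from each.
            patch = [row[left:left + size] for row in image[top:top + size]]
            patches.append((top, left, patch))
    return patches


# A toy 4x6 "image" whose pixel values are just running numbers.
toy_image = [[r * 6 + c for c in range(6)] for r in range(4)]
patches = split_into_patches(toy_image, 2)
print(len(patches))  # 6 patches: 2 rows of patches x 3 columns
print(patches[0])    # (0, 0, [[0, 1], [6, 7]])
```

Keeping each patch's position alongside its pixels matters for the second step, since learning how chunks fit together requires knowing where each one sat in the original image.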
To assist the learning process, the researchers immersed the new system in an internet replica of the human environment. They tested it on 9,000 images displaying people and other objects.
"Fortunately, the internet provides two things that help a brain-inspired computer vision system learn the same way humans do," says Roychowdhury. "One is a wealth of images and videos that depict the same types of objects. The second is that these objects are shown from many perspectives -- obscured, bird's eye, up-close -- and they are placed in different kinds of environments."