By enabling AI models to see the world more as humans do, new research on peripheral vision could help improve driver safety and shed light on human behavior.
Adam Zewe reports for MIT News on research at MIT that aims to give AI models a form of peripheral vision. Peripheral vision lets humans perceive shapes outside the direct line of sight, albeit with less detail. It widens our field of view and helps in many situations, such as spotting a vehicle approaching from the side. AI models, in contrast, have no peripheral vision. Equipping computer vision models with something like it could help them detect approaching hazards more effectively, or predict whether a human driver would notice an oncoming object. The MIT researchers took a step in this direction by building an image dataset that simulates peripheral vision in machine learning models. Models trained on this dataset got better at detecting objects in the visual periphery, though they still fell short of human performance. Notably, and unlike in human vision, neither the size of objects nor the amount of visual clutter strongly affected the models' performance.
Vasha DuTell, a postdoctoral researcher and co-author of the study, highlights the challenge of pinpointing what sets human vision apart from AI models, since even extensively trained models fell short of human performance. Answering that question could lead to machine learning models that see the world more the way humans do, with potential applications in improving driver safety and designing displays that are easier to view. Lead author Anne Harrington and her co-authors, Mark Hamilton, a graduate student in electrical engineering and computer science; Ayush Tewari, a postdoctoral researcher; Simon Stent, research manager at the Toyota Research Institute; and senior authors William T. Freeman, the Thomas and Gerd Perkins Professor of Electrical Engineering and Computer Science and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL), and Ruth Rosenholtz, principal research scientist in the Department of Brain and Cognitive Sciences and a member of CSAIL, aim to deepen the understanding of peripheral vision in AI. Modeling it, they argue, could improve predictions of human behavior and help machines interact with people, given how fundamental peripheral vision is to that interaction.
To mimic human peripheral vision in AI models, the team went beyond traditional techniques that simply blur the edges of images to represent the loss of detail away from the center of gaze. Their approach, based on the texture tiling model used in human vision research, transforms images in a flexible way that does not require knowing in advance where the viewer will look. The resulting dataset of images, which simulate the textural changes seen in human peripheral vision, was used to train several computer vision models. Their performance was then compared with that of humans on an object detection task, and humans proved markedly better at detecting objects in their periphery.
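The article does not include implementation details, but the general idea of eccentricity-dependent texture pooling can be sketched. The snippet below is a deliberately crude illustration, not the researchers' texture tiling model: it merely shuffles pixels within pooling regions that grow with distance from a fixation point, preserving each region's first-order statistics while destroying fine structure, whereas the actual model matches a much richer set of texture statistics per region. The function names and parameters (`crude_texture_tiling`, `fovea`, `growth`) are hypothetical.

```python
import numpy as np

def tile_size(ecc: float, base: int = 4, growth: float = 0.15) -> int:
    # Pooling-region side length grows linearly with eccentricity (in pixels).
    return base + int(growth * ecc)

def crude_texture_tiling(image: np.ndarray, fixation: tuple,
                         fovea: int = 32, seed: int = 0) -> np.ndarray:
    """Crude stand-in for a peripheral-vision transform: shuffle pixels
    within pooling regions whose size grows with distance from fixation."""
    rng = np.random.default_rng(seed)
    out = image.copy()
    h, w = image.shape[:2]
    fy, fx = fixation
    y = 0
    while y < h:
        th = tile_size(abs(y - fy))        # row height from vertical eccentricity
        x = 0
        while x < w:
            ecc = float(np.hypot(y - fy, x - fx))
            tw = tile_size(ecc)
            if ecc > fovea:                # leave a small central region intact
                block = out[y:y + th, x:x + tw].copy()
                flat = (block.reshape(-1, block.shape[-1])
                        if block.ndim == 3 else block.reshape(-1))
                rng.shuffle(flat)          # permute pixels within the region
                out[y:y + th, x:x + tw] = flat.reshape(block.shape)
            x += tw
        y += th
    return out

# Demo on a synthetic image; fixation at the image center.
demo = np.random.default_rng(1).integers(0, 256, (240, 320, 3), dtype=np.uint8)
peripheral = crude_texture_tiling(demo, fixation=(120, 160))
```

Because the transform depends only on distance from the chosen fixation point, it can be applied to any image for any assumed gaze location, which is consistent with the flexibility the article describes.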
The experiments showed that models trained from scratch on the new dataset improved substantially at detecting and recognizing objects, while fine-tuning pre-trained models yielded only modest gains. Even so, the AI models struggled with objects in the far periphery and did not use context for detection the way humans do. The team plans to investigate these disparities further, aiming to develop models that accurately predict human performance on peripheral vision tasks. This could pave the way for AI systems that better alert drivers to unseen hazards.
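To make the from-scratch versus fine-tuning comparison concrete, here is a minimal, hypothetical PyTorch sketch. The model choice (ResNet-50), optimizer, hyperparameters, and dummy data are illustrative assumptions rather than the paper's recipe, and for simplicity it trains a classifier, whereas the study evaluated object detection.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models

num_classes = 10  # hypothetical label set for the transformed dataset

# Dummy stand-in batch; in practice each image would first be passed
# through a peripheral-vision transform like the one sketched above.
images = torch.randn(64, 3, 224, 224)
labels = torch.randint(0, num_classes, (64,))
loader = DataLoader(TensorDataset(images, labels), batch_size=16)

# From-scratch route (the one reported to help most): random initialization.
model = models.resnet50(weights=None, num_classes=num_classes)
# Fine-tuning route (reported to give only modest gains) would instead be:
#   model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
#   model.fc = nn.Linear(model.fc.in_features, num_classes)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for batch_images, batch_labels in loader:  # one epoch over the dummy data
    optimizer.zero_grad()
    loss = loss_fn(model(batch_images), batch_labels)
    loss.backward()
    optimizer.step()
```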
The significance of this research is underscored by insights from Justin Gardner, an associate professor at Stanford University not involved in the study. He notes the importance of recognizing human peripheral vision not merely as a limitation but as an optimized representation for real-world tasks. He also points out that despite advancements, current neural network models still cannot match human peripheral vision performance, suggesting a need for more AI research informed by the neuroscience of human vision. The database of images simulating human peripheral vision created by the MIT team is expected to significantly aid future research in this area.
Credits
Adam Zewe originally wrote and reported this story for MIT News on March 8, 2024, under the title "Researchers enhance peripheral vision in AI models." The work was supported, in part, by the Toyota Research Institute and the MIT CSAIL METEOR Fellowship.
Photo © Alexander Krivitskiy.