This year, the IEEE Technical Committee on Pattern Analysis and Machine Intelligence, the organizer of the International Conference on Computer Vision (ICCV), inaugurated the Helmholtz Prize, a test-of-time award presented for papers published at least 10 years ago that continue to influence the field of computer vision.
Given the huge strides the field has achieved, identifying papers with the greatest impact loomed as a daunting challenge, but for the authors of the papers, the results were gratifying—particularly for a couple of scientists from Microsoft Research.
During ICCV 2013, held in Sydney from Dec. 3 to 6, P. Anandan and Zhengyou Zhang had seminal papers recognized in the first set of Helmholtz Prize honorees.
Anandan, a Microsoft distinguished scientist and managing director of Microsoft Research India, shared the award with colleague Michael Black for A Framework for the Robust Estimation of Optical Flow, presented during ICCV 1993.
Zhang’s paper, Flexible Camera Calibration by Viewing a Plane from Unknown Orientations, for which he was the sole author, was accepted for the conference six years later.
While both researchers have undertaken many projects over the ensuing years, their memories of their Helmholtz Prize-winning work remain fresh.
“This paper was a culmination of 10 years of research, starting with my Ph.D. thesis work,” recalls Anandan, a research manager at the David Sarnoff Research Center when the paper was written. “I published the first version of the algorithm in 1987, at the first ICCV conference, in London. The work continued after I joined Yale as a faculty member in computer science, when Michael Black joined as a student and worked under my supervision. He extended and expanded the algorithm, making it more robust, efficient, and general. This was the algorithm that was published in 1993 at ICCV in Berlin—and that won the award.”
Zhang’s recollections are similarly precise.
“This was done 15 years ago at Microsoft Research,” says Zhang, a principal researcher and manager of the Multimedia, Interaction, and Communication group at Microsoft Research Redmond. “After I joined Microsoft Research, I wrote a memo entitled Desktop Vision for my research plan. I was envisioning that, one day, every desktop computer would come equipped with a camera that was capable of perceiving the surrounding 3-D environment and understanding the emotion of its user. The first step was to know the camera parameters, such as the focal length, the pixel-aspect ratio, the center of the image sensor, and the camera’s position and orientation. This is the process of camera calibration.”
His paper detailed a flexible technique for making that easy, requiring the camera only to observe a planar pattern shown at a small number of different orientations.
“The classical approach was to use a precisely fabricated 3-D apparatus, which is expensive and inconvenient,” Zhang says. “I wanted to devise a flexible calibration technique. With my knowledge of projective geometry I gained through my previous research career at INRIA [the French Institute for Research in Computer Science and Control], I discovered that we could calibrate a camera by just showing a planar pattern at a few different orientations. Either the camera or the planar pattern can be freely moved, and the pattern can be printed on paper, then attached to a planar surface such as cardboard. It is really easy to use and flexible.”
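To see why a flat pattern can be enough, here is a minimal sketch that assumes a standard pinhole camera model (the article does not spell out the model, so this is an assumption). If the pattern lies in the plane Z = 0, projecting a pattern point with the full camera matrix gives the same pixel as applying a 3x3 homography H = K [r1 r2 t], where K holds the intrinsic parameters and r1, r2, t come from the camera's orientation and position. Each view of the pattern yields one such homography, which in turn constrains K. The function names and every numeric value below are hypothetical and chosen only for illustration; this is not code from Zhang's paper.

```python
import math


def rotation_y(theta):
    """Rotation matrix for a turn of theta radians about the Y axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def project_full(K, R, t, X, Y):
    """Pinhole projection of the plane point (X, Y, 0): x ~ K (R [X, Y, 0]^T + t)."""
    cam = [R[i][0] * X + R[i][1] * Y + t[i] for i in range(3)]
    u, v, w = (sum(K[i][j] * cam[j] for j in range(3)) for i in range(3))
    return (u / w, v / w)


def plane_homography(K, R, t):
    """Build H = K [r1 r2 t], using the first two columns of R and the translation."""
    M = [[R[i][0], R[i][1], t[i]] for i in range(3)]
    return [[sum(K[i][k] * M[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def project_homography(H, X, Y):
    """Map the plane point (X, Y) to pixel coordinates with the homography H."""
    u, v, w = (H[i][0] * X + H[i][1] * Y + H[i][2] for i in range(3))
    return (u / w, v / w)


# Hypothetical intrinsics: focal length 800 px, image center at (320, 240).
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
R = rotation_y(0.3)      # pattern tilted about the vertical axis
t = [0.1, -0.05, 2.0]    # pattern roughly two units in front of the camera
H = plane_homography(K, R, t)

for X, Y in [(0.0, 0.0), (0.1, 0.2), (0.2, 0.05)]:
    print(project_full(K, R, t, X, Y), project_homography(H, X, Y))
```

Both columns of the printed output should agree for every point. A full calibration would go the other way: estimate H from detected pattern corners in several views, then solve for K. That step is omitted here.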
Anandan and Black saw their algorithm gain rapid acceptance. Given a pair of images, optical flow tracks each pixel from one frame to the next, producing a motion vector for every pixel in the shot. Further analysis of those vectors lets image-processing software identify shapes and objects in the images. The technique quickly gained high-profile adherents.
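The idea of one motion vector per pixel can be illustrated with a deliberately simple stand-in: brute-force block matching on two tiny synthetic frames. This is not the robust estimation framework from the 1993 paper; it is only a sketch of what a dense flow field looks like. The function names, frame size, patch radius, and search range are all hypothetical.

```python
def make_frame(width, height, ox, oy):
    """Synthetic frame: a bright 3x3 square with top-left corner (ox, oy) on a dark background."""
    return [[1.0 if ox <= x < ox + 3 and oy <= y < oy + 3 else 0.0
             for x in range(width)] for y in range(height)]


def patch_cost(f0, f1, x, y, dx, dy, r):
    """Sum of absolute differences between the patch at (x, y) in f0
    and the patch at (x + dx, y + dy) in f1."""
    h, w = len(f0), len(f0[0])
    cost = 0.0
    for j in range(-r, r + 1):
        for i in range(-r, r + 1):
            x0, y0 = x + i, y + j
            x1, y1 = x0 + dx, y0 + dy
            inside = 0 <= x0 < w and 0 <= y0 < h and 0 <= x1 < w and 0 <= y1 < h
            # Samples that fall outside either frame get a fixed penalty.
            cost += abs(f0[y0][x0] - f1[y1][x1]) if inside else 1.0
    return cost


def dense_flow(f0, f1, r=1, search=2):
    """Return flow[y][x] = (dx, dy), the best-matching displacement for every pixel."""
    h, w = len(f0), len(f0[0])
    # Try small displacements first so that ties resolve to the smallest motion.
    candidates = sorted(
        ((dx, dy) for dy in range(-search, search + 1)
         for dx in range(-search, search + 1)),
        key=lambda d: abs(d[0]) + abs(d[1]))
    return [[min(candidates, key=lambda d: patch_cost(f0, f1, x, y, d[0], d[1], r))
             for x in range(w)] for y in range(h)]


frame0 = make_frame(10, 10, 3, 3)   # square starts at (3, 3)
frame1 = make_frame(10, 10, 5, 4)   # square has moved by (+2, +1)
flow = dense_flow(frame0, frame1)
print(flow[4][4])   # vector at the center of the square
print(flow[8][1])   # vector at a background pixel
```

Exhaustive search like this scales poorly and is sensitive to noise and to regions containing more than one motion, which is part of why more robust formulations matter in practice.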
“We knew that the algorithm was popular even at that time,” Anandan says. “Michael had made it available as a free download on his webpage, and it was downloaded by a number of organizations, including those in the movie industry, where it was widely used.”
Millions of filmgoers have reaped the benefits while viewing such blockbusters as Mission: Impossible and the Matrix series.
“Since then, the work on optical flow has continued,” Anandan adds, “but it appears, after two decades, that the algorithm we published in 1993 is still one of the best, and many other algorithms include some of the key features of ours.”
It has been about a decade since Anandan was last involved in optical-flow projects, but Black continues to advance the work and is considered the world’s leading expert on the topic, now as director of the Max Planck Institute for Intelligent Systems, located in Tübingen, Germany.
Zhang, too, sees continued, robust interest in his work that, 14 years later, resulted in a Helmholtz Prize. His ICCV paper, and its successor published in IEEE Transactions on Pattern Analysis and Machine Intelligence, have been cited more than 7,400 times.
In fact, it could well be the most widely used camera-calibration algorithm extant. The technique outlined in Zhang’s paper is employed in computer-vision research labs the world over, as well as by many companies. A version calibrated the vision system in NASA’s Mars Rover, and another version is used to recalibrate Kinect sensors.
“I developed the technology for my own need,” he says. “It is useful for me, and it should be useful for others, too.
“Did I have any idea that the paper would have this sort of lasting impact? No, but I did say in my paper that ‘it advances 3-D computer vision one step, from laboratory environments to real-world use.’”