This week at TechFest, the Technologies for Emerging Markets team at Microsoft Research India is showing a lightweight, inexpensive system for instantly gathering responses from students in classrooms.
Posted by Rob Knies
You’re looking for a photo of a flower. Not just any photo: it needs to be horizontal in shape. And not just any flower: it needs to be a purple flower.

What do you do? You could perform a conventional image search on the web. There are lots of flowers out there, in lots of shapes and lots of colors. Poke around for a while, and you just might find what you need.

Alternatively, you can use the filter bar in Bing Image Search, which has been augmented by work from Microsoft Research Asia. Type in the textual query “flower,” filter for “purple,” “photograph,” and “wide,” and voilà: a collection of horizontal shots of purple flowers pops up.

The color filter is thanks, in large part, to research by Jingdong Wang and Shipeng Li. They are in Providence, R.I., from June 16 to 21, attending the Institute of Electrical and Electronics Engineers’ 2012 Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2012), during which they are presenting their paper Salient Object Detection for Searched Web Images via Global Saliency, written in collaboration with Peng Wang, Gang Zeng, Jie Feng, and Hongbin Zha of the Key Laboratory on Machine Perception at Peking University.
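To make the idea concrete, here is a toy sketch of how such filters could narrow a result set. It scores a whole image by dominant hue and aspect ratio; the paper’s actual contribution is detecting the salient object before judging color, which this sketch does not attempt. The function names and thresholds below are illustrative assumptions, not Bing’s implementation.

```python
# Toy filter in the spirit of a "purple" + "wide" + "photograph" refinement.
# NOT Bing's pipeline: the paper first detects the salient object; this
# sketch just scores the whole image. Thresholds are illustrative guesses.
from collections import Counter

from PIL import Image  # pip install Pillow


def dominant_hue(img: Image.Image) -> int:
    """Most common hue (PIL's 0-255 HSV hue scale) over a downsampled copy."""
    hsv = img.convert("HSV").resize((64, 64))   # small copy is enough to score
    hues = [pixel[0] for pixel in hsv.getdata()]
    return Counter(hues).most_common(1)[0][0]


def matches_purple_and_wide(path: str) -> bool:
    img = Image.open(path)
    width, height = img.size
    is_wide = width / height > 1.2              # "wide": landscape orientation
    is_purple = 180 <= dominant_hue(img) <= 220  # rough purple band in PIL HSV
    return is_wide and is_purple


# Usage: keep only the candidate results that satisfy both filters.
# results = [p for p in candidate_paths if matches_purple_and_wide(p)]
```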
Last August, my colleague Janie Chang wrote a feature story titled Speech Recognition Leaps Forward that was published on the Microsoft Research website. The article outlined how Dong Yu, of Microsoft Research Redmond, and Frank Seide, of Microsoft Research Asia, had extended the state of the art in real-time, speaker-independent, automatic speech recognition.

Now, that improvement has been deployed to the world. Microsoft is updating the Microsoft Audio Video Indexing Service with new algorithms that enable customers to take advantage of the improved accuracy detailed in a paper that Yu, Seide, and Gang Li, also of Microsoft Research Asia, delivered in Florence, Italy, during Interspeech 2011, the 12th annual Conference of the International Speech Communication Association.

The update marks the first time a company has shipped deep-neural-network (DNN)-based speech recognition in a commercial product.

It’s a big deal. The benefits, says Behrooz Chitsaz, director of Intellectual Property Strategy for Microsoft Research, are improved accuracy and faster processing.
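For readers curious what a DNN acoustic model actually computes, the sketch below is a minimal, self-contained illustration: a feedforward network that maps each window of acoustic feature frames to posterior probabilities over senones (tied HMM states), the quantities a recognizer of this kind consumes. The layer sizes are deliberately toy-scale and the weights are random; this is a sketch of the general recipe, not Microsoft’s trained model.

```python
# Minimal illustration of what a DNN acoustic model computes: per-frame
# posteriors over senones (tied triphone HMM states). Sizes are toy-scale;
# production systems use thousands of hidden units and senone outputs.
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


class ToyAcousticDNN:
    def __init__(self, in_dim=440, hidden_dims=(256, 256), out_dim=1000):
        # in_dim 440 = 11 stacked frames x 40 features, a common input setup.
        dims = [in_dim, *hidden_dims, out_dim]
        self.weights = [rng.normal(0.0, 0.05, (a, b)) for a, b in zip(dims, dims[1:])]
        self.biases = [np.zeros(b) for b in dims[1:]]

    def forward(self, frames: np.ndarray) -> np.ndarray:
        h = frames
        for W, b in zip(self.weights[:-1], self.biases[:-1]):
            h = sigmoid(h @ W + b)                               # hidden layers
        return softmax(h @ self.weights[-1] + self.biases[-1])   # senone posteriors


# A batch of 16 feature frames in, per-frame senone posteriors out.
posteriors = ToyAcousticDNN().forward(rng.normal(size=(16, 440)))
print(posteriors.shape)  # (16, 1000); each row sums to 1
```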
Sign language is the primary language for many deaf and hard-of-hearing people, but it currently is not possible for them to interact with computers in their native language.

Because not everyone understands sign language and human sign-language interpreters are not always available, researchers in recent years have devoted considerable effort to the challenges of sign-language recognition. They have examined the potential of input sensors such as data gloves and special cameras. Data gloves provide good recognition performance, but they are inconvenient to wear and have proven too expensive for mass use. Web cameras and stereo cameras, while accurate and fast at hand tracking, struggle with cluttered real-world backgrounds and illumination outside controlled conditions.

Then along came a device called Kinect. Researchers from Microsoft Research Asia have collaborated with colleagues from the Institute of Computing Technology at the Chinese Academy of Sciences (CAS) to explore how Kinect’s body-tracking abilities can be applied to the problem of sign-language recognition. The results have been encouraging: they point toward people whose primary language is sign language interacting more naturally with their computers, in much the same way that speech recognition enables for spoken language.
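One plausible building block for such a system, offered here as an assumption rather than as the team’s published algorithm, is matching a Kinect-tracked hand-joint trajectory against recorded sign templates with dynamic time warping (DTW), which tolerates signs performed at different speeds. The labels and data below are hypothetical.

```python
# Sketch of trajectory matching for sign recognition: compare an observed
# Kinect hand-joint trajectory against stored templates with dynamic time
# warping (DTW). Illustrative only; not the MSRA/CAS team's exact method.
import numpy as np


def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """DTW cost between two trajectories of 3-D points, shapes (m, 3), (n, 3)."""
    m, n = len(a), len(b)
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])   # frame-to-frame distance
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[m, n]


def recognize(trajectory: np.ndarray, templates: dict) -> str:
    """Return the label of the template closest to the observed trajectory."""
    return min(templates, key=lambda label: dtw_distance(trajectory, templates[label]))


# Hypothetical usage: `templates` maps sign labels to recorded trajectories.
templates = {
    "hello": np.cumsum(np.random.default_rng(1).normal(size=(30, 3)), axis=0),
    "thanks": np.cumsum(np.random.default_rng(2).normal(size=(25, 3)), axis=0),
}
observed = templates["hello"] + 0.05   # a slightly shifted "hello"
print(recognize(observed, templates))  # -> "hello"
```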
The set was simple: a simulated office, with a desk, a chair, a floor lamp, a wall calendar, a row of bookshelves packed with scores of academic journals, and a scraggly-looking plant at stage right. “A depressingly faithful reproduction of my office,” said Stephen Emmott.

Emmott, head of Computational Science at Microsoft Research, based in Cambridge, U.K., was greeting a July 14 audience, 80 strong, in the Jerwood Theatre Upstairs at the Royal Court Theatre in the Chelsea area of West London for the third of 27 performances of Ten Billion.

The one-man, hour-long performance, a collaboration with British stage director Katie Mitchell, was billed as an exploration of the future of life on Earth, and the pairing was intriguing. As noted in this space back in May, Mitchell is one of the United Kingdom’s pre-eminent theatrical figures. Emmott, also a professor at the University of Oxford and University College London, leads the Computational Science Lab, which focuses on developing a new kind of precise, predictive science of complex systems.

In short, Ten Billion represented nothing less than a rarely visited intersection of science and art, captured in Emmott’s comment about the engagement: “It’s not a play, it’s not a typical scientific talk. It’s an experiment.”