• Top Posts of 2014: Deep Learning, Predictive Analytics, and Human-Computer Interaction

    Posted by Microsoft Research

    Our most popular blog posts of 2014 reflect the breadth of our research and our collaborative efforts across multiple product groups as well as with external organizations worldwide. Also see the top feature stories of 2014.

    Bringing Big-Data Dreams Down to Earth
    Holograph, an interactive, 3-D data-visualization research platform that can render static and dynamic data above or below the plane of a display using a variety of 3-D stereographic techniques, is featured.

    Krysta Svore and David Rothschild

    Unraveling the Mysteries of Quantum Computing
    Researcher Krysta Svore, whose team is developing a full software architecture for quantum computing, discusses a potential future for machine learning: "We know quantum computers give exponential speedups for some problems. The question is: Can we get exponential speedups for problems in big data, data analysis, and machine learning?"

    Seeking Answers amid World Cup Excitement
    Researcher and economist David Rothschild takes the science of predictive analytics to new heights, whether the subject is politics, entertainment, or sports.

    Share Your Photos, Not Your Phone
    Xim, a free app for Windows Phone, Android, and iOS, enhances social interactions and experiences by enabling nearly effortless photo sharing, whether face to face or in a remote setting.

    Making Cortana the Researcher’s Dream Assistant
    Cortana, the personal assistant for Windows Phone 8.1 powered by Bing, gets an infusion of academic data tightly integrated and prominently featured on its search pages. 

    The Catapult team

    Catapult: Moving Beyond CPUs in the Cloud
    In a collaborative project called Catapult, Microsoft researchers and colleagues from Bing describe an effort to combine programmable hardware and software that uses field-programmable gate arrays (FPGAs) to deliver performance improvements of as much as 95 percent.

    OSDI '14 Highlight: Preserving Trust in the Cloud
    Haven, the first system to achieve shielded execution of unmodified legacy applications, including SQL Server and Apache, on a commodity OS (Windows) and commodity hardware, is awarded Best Paper at the 11th USENIX Symposium on Operating Systems Design and Implementation. Other OSDI 2014 papers are featured. 

    Microsoft Research Adopts Open Access Policy for Publications
    The open-access policy underscores Microsoft’s commitment to promoting open publication of all research results and encouraging deep collaborations with academic researchers.

    Young Talent Gathers at Microsoft Research Asia PhD Forum
    The past decade has witnessed an incredible boom in Chinese academic research—a boom fueled in large measure by talented young researchers. Microsoft Research Asia’s Joint PhD Program collaborates with leading Chinese universities to discover and foster outstanding research talent. 

    What if Coding Were a Game?
    Code Hunt, a browser-based game, teaches coding as a by-product of solving a problem that is presented as pattern matching inputs and outputs. The fun is in finding the pattern.

  • New Research Brings Precision to Sampling Methods Used in Statistics and Machine Learning

    Posted by George Thomas Jr.

    Addressing one of the core problems in statistics and machine learning, Microsoft researchers have developed a new, more efficient algorithm that enables exact sampling.

    Daniel Tarlow and Tom Minka

    Researchers Daniel Tarlow and Tom Minka and former Microsoft intern Chris Maddison introduced the algorithm in their paper, A* Sampling, one of only two papers among the 1,700 submitted to receive an Outstanding Paper Award at NIPS 2014, the renowned machine learning conference of the Neural Information Processing Systems Foundation.

    “This research makes a very significant advance in the efficiency of sampling, which is a core component of probabilistic modelling and reasoning systems,” said Andrew Blake, Distinguished Scientist and Laboratory Director of Microsoft Research Cambridge.

    In their paper, the authors present a new construction of the Gumbel process and A* sampling, an algorithm that searches for the maximum of a Gumbel process using A* search. They explain how this different perspective enables exact sampling, whereas previous solutions were forced to resort to approximate sampling.

    Illustration of A* sampling

    Citing inspiration from an algorithm for sampling from a discrete distribution known as the “Gumbel-Max trick,” the authors note how an exact sample results by adding independent Gumbel perturbations to each configuration of a discrete negative energy function and returning the argmax configuration of the perturbed negative energy function.
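
    The basic discrete Gumbel-Max trick described above can be sketched in a few lines of Python. This is only an illustration of the trick that inspired the paper, not of A* sampling itself: it instantiates every perturbation explicitly, which is exactly the cost the authors set out to avoid. The function names and the inverse-CDF method for drawing Gumbel noise are choices made for this sketch, not taken from the paper.

```python
import math
import random
from collections import Counter


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) variate by inverting its CDF."""
    u = rng.random()
    # random() returns values in [0.0, 1.0); redraw on 0.0 so log(u) is defined.
    while u == 0.0:
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_max_sample(neg_energy, rng):
    """Return index i with probability proportional to exp(neg_energy[i]).

    Adds an independent Gumbel perturbation to every configuration and
    returns the argmax of the perturbed negative energies.
    """
    perturbed = [phi + sample_gumbel(rng) for phi in neg_energy]
    return max(range(len(perturbed)), key=perturbed.__getitem__)


def empirical_frequencies(neg_energy, n_draws, seed=0):
    """Estimate the sampling distribution by repeated Gumbel-Max draws."""
    rng = random.Random(seed)
    counts = Counter(gumbel_max_sample(neg_energy, rng) for _ in range(n_draws))
    return [counts[i] / n_draws for i in range(len(neg_energy))]


def softmax(neg_energy):
    """Target distribution: exp(neg_energy[i]) normalized to sum to one."""
    m = max(neg_energy)
    weights = [math.exp(phi - m) for phi in neg_energy]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Unnormalized weights 1, 2, and 7, expressed as negative energies (logs).
    phi = [math.log(1.0), math.log(2.0), math.log(7.0)]
    print("target   :", softmax(phi))
    print("empirical:", empirical_frequencies(phi, n_draws=20000))
```

    Comparing the two printed lists shows the empirical frequencies settling near the normalized weights as the number of draws grows. Note that the sampler never computes the normalizing constant; only the argmax of the perturbed values is needed.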

    “Our first key observation,” the authors write, “is that we can apply the Gumbel-Max trick without instantiating all of the (possibly exponentially many) Gumbel perturbations. The same basic idea then allows us to extend the Gumbel-Max trick to continuous spaces where there will be infinitely many independent perturbations.

    “Intuitively, for any given random energy function, there are many perturbation values that are irrelevant to determining the argmax so long as we have an upper bound on their values. We will show how to instantiate the relevant ones and bound the irrelevant ones, allowing us to find the argmax -- and thus an exact sample.”

    Tarlow hopes these findings will lead to probabilistic reasoning systems that are more powerful and easier to use than current systems. “When we can provide stronger guarantees about the quality of outputs from our inference algorithms, it becomes easier to use these algorithms inside larger systems and to build tools that can be used reliably by non-experts,” he said.

    Machine learning is a key focus of Microsoft Research (@MSFTResearch) and has led to numerous product contributions that include Microsoft Office, SQL Server, Xbox One, Cortana speech recognition, and Skype Translator.

    The Neural Information Processing Systems Foundation is a non-profit corporation whose purpose is to foster the exchange of research on neural information processing systems in their biological, technological, mathematical, and theoretical aspects.

  • Addressing Fairness, Accountability, and Transparency in Machine Learning

    Posted by Microsoft Research

    Machine learning and big data are certainly hot topics that emerged within the tech community in 2014. But what are the real-world implications for how we interpret what happens inside the data centers that churn through mountains of seemingly endless data?

    Hanna Wallach

    For Microsoft machine learning researcher Hanna Wallach (@hannawallach), opportunity lies outside the box. As an invited speaker at the NIPS 2014 workshop on Fairness, Accountability, and Transparency in Machine Learning, Wallach spoke about how her shift in research to the emerging field of computational social science led her to new insights about how machine learning methods can be applied to analyze real-world data about society.

    Her talk, “Big Data, Machine Learning, and the Social Sciences,” now available online, focuses on the four keys that she says lie at the heart of the matter: data, questions, models, and findings.

    "Within computer science, there’s a lot of enthusiasm about big data at the moment," Wallach says. "But when it comes to addressing bias, fairness, and inclusion, perhaps we need to focus our attention on the granular nature of big data, or the fact that there may be many interesting data sets, nested within these larger collections, for which average-case statistical patterns may not hold."

    A researcher at Microsoft Research New York City, Wallach also is a core faculty member in the recently formed Computational Social Science Initiative at the University of Massachusetts Amherst.