Posted by Rob Knies
Last year, David Rothschild of Microsoft Research New York City used a versatile, data-driven model to predict correctly the results of the U.S. presidential election in 50 of 51 jurisdictions: the nation’s 50 states and the District of Columbia.
Given the overwhelming accuracy of those predictions (better than 98 percent), it’s no wonder that the work of Rothschild and a few other individuals trying to learn how to harness the value of big data gained the attention of the news media. “Some things,” wrote Steven Cherry in IEEE Spectrum, “are predictable, if you go to the people who rely on data and not their gut.”
People, in other words, like Rothschild, who readily admits that his role is to “push the boundaries of information aggregation.”
Now, as the next effort in his quest to make use of big data to reinvent how we think about predictions and forecasting (and, coincidentally, to make potential contributions to enable Microsoft to build better products and services), Rothschild has turned his predictive attention toward another major media event of global proportions: the Academy of Motion Picture Arts and Sciences' 85th annual Academy Awards.
You can see his latest forecasts on his PredictWise blog. As part of that effort, he is collaborating with the Office team to power its Oscars Ballot Predictor, an Excel app that provides real-time predictions in all 24 Oscar categories. For Rothschild, it’s just part of an honest day’s work.
“I approach forecasting the Oscars the same way I approach forecasting anything, including politics,” he says. “I look for the most efficient data, and I create statistically significant models without any regard for the outcomes in any particular year. All models are tested and calibrated on historical data, with great pains taken to ensure that the model is robust to ‘out-of-sample’ outcomes, not just what has happened in the past. The models predict the future, not just the past.
“Thus, the science is identical, but there are differences in which data prove most useful.”
You might think that a forecasting model able to conquer the vagaries of something as volatile as a presidential election, with nearly 127 million votes cast, would be a cinch for success in something less complex, such as Oscars balloting with a voting membership of fewer than 6,000. But those data differences are significant. “There are four different types of data I generally focus on: polling, prediction markets, fundamental data, and user-generated data,” Rothschild says. “For politics, I use fundamental data (such as past election results, incumbency, and economic indicators) early in the cycle to establish a baseline and then switch to prediction markets and polls later in the cycle as they absorb and contain more and more information about the election. I used user-generated data for the 2012 elections sparingly, but the Xbox LIVE data was crucial in supplementing real-time analysis of major events.
“For the Oscars, there is no polling, and fundamental data (box-office returns, movie ratings) are not statistically effective. I focus even more heavily on prediction markets, which are very robust, but I also include some user-generated data that helps me learn more about correlations within movies and between categories, such as, ‘How many categories will Lincoln win?’”
It’s instructive to hear Rothschild sketch his plan for tackling a project such as his Oscars predictions.
“Whenever I focus on a new domain,” he says, “I want to consider a few key things in making a meaningful forecast:
So, Mr. Rothschild, on Feb. 24, the Oscars will go to …?
“I am making some very strong predictions this year,” he responds, providing likelihoods for victory:
Those numbers are preliminary, of course, but Rothschild isn’t concerned.
“I am pretty darn confident,” he says, “but the predictions are not 100-percent confident for a reason, so I look forward to seeing how we do on Oscar night!”