I recently returned from Paris, where I attended both the annual Open Source Think Tank and Open World Forum events. It was really great getting to chat with some of the folks representing the myriad of businesses that have sprung up around Open Source solutions, and having some in-depth discussions about broad industry trends.
The Open Source Think Tank is pretty much a unique event in that it gives attendees the opportunity to examine open source and cloud evolution through detailed analysis and discussions of specific industry related case studies, as well as panels, presentations and networking opportunities with a collaborative group of folks from across the industry.
For its part, Open World Forum brings together hundreds of decision-makers, developers and users from across the world to discuss Open technological, business and societal initiatives to help shape the digital future.
I was happy to be able to participate in a number of panel discussions at both events. At the Think Tank, I got to brainstorm on the topic of “Open Source Ethos as an Agent of Change,” which essentially looked at how closed source companies use the open source ethos to energize their companies and change how they relate to their customers, partners and employees. I was joined by Erynn Petersen of AOL and Gil Yehuda of Yahoo, and a lively conversation ensued.
From a Microsoft perspective I pointed out how we recognize the value of openness in working with a diverse array of OSS communities to help developers, customers and partners succeed in today's heterogeneous IT environments.
I noted that we now have a better appreciation for how the open source development model can be useful for our own software development as well as the potential for Microsoft technologies to be great platforms for open source applications. I also briefly talked about our increased investments in standards, interoperability and integration with Open Source Software.
The second Think Tank discussion revolved around Open Source, Open Systems and Open Standards and what that means today. Larry Augustin from SugarCRM and Yahoo's Gil Yehuda also participated, and a lively discussion ensued, a lot of which was way off topic :-)
In a couple of weeks it will be my one-year anniversary here at Microsoft, and I couldn’t wish for a better anniversary gift: now that Microsoft has laid out its roadmap for Big Data, I’m really excited about the role that Apache Hadoop™ plays in this.
In case you missed it, Microsoft Corporate Vice President Ted Kummert announced earlier today that we are adopting Hadoop, with plans to deliver enterprise-class Apache Hadoop-based distributions on both Windows Server and Windows Azure.
This news is loaded with goodies for the big data community, broadening the accessibility and usage of Hadoop-based technologies among developers and IT professionals, by making it available on Windows Server and Windows Azure.
But there is more. Microsoft will be working with the community to offer contributions for inclusion into the Apache Hadoop project and its ecosystem of tools and technologies.
I believe that all of this will really benefit not only the broader Open Source community, by enabling them to take their existing skill sets and assets and use them on Windows Azure and Windows Server, but also developers, our customers and partners. It is also another example of our ongoing commitment to providing Interoperability, compatibility and flexibility.
As a proud member of the Apache Software Foundation, I personally could not be happier to see how Microsoft is willing to engage in such an important Open Source project and community.
Technical Considerations
On the more technical front, we have been working on a simplified download, installation and configuration experience of several Hadoop related technologies, including HDFS, Hive, and Pig, which will help broaden the adoption of Hadoop in the enterprise.
The Hadoop based service for Windows Azure will allow any developer or user to submit and run standard Hadoop jobs directly on the Azure cloud with a simple user experience.
Let me stress this once again: it doesn’t matter what platform you are developing your Hadoop jobs on. You will always be able to take a standard Hadoop job and deploy it on our platform, as we strive towards full interoperability with the official Apache Hadoop distribution.
This is great news as it lowers the barrier for building Hadoop based applications while encouraging rapid prototyping scenarios in the Windows Azure cloud for Big Data.
To facilitate all of this, we have also entered into a strategic partnership with Hortonworks that enables us to gain unique experience and expertise to help accelerate the delivery of Microsoft’s Hadoop based distributions on both Windows Server and Windows Azure.
For developers, we will enable integration with Microsoft developer tools as well as invest in making JavaScript a first class language for Big Data. We will do this by making it possible to write high performance Map/Reduce jobs using JavaScript. Yes, JavaScript Map/Reduce, you read it right.
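For readers new to the model, here is a minimal sketch of what a Map/Reduce job computes, using the classic word count. It is written in Python purely for illustration and is not the JavaScript support mentioned above, whose details are not covered in this post. The names map_words, reduce_counts and run_job are hypothetical, and everything runs in memory rather than on a Hadoop cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce phase: sum all counts emitted for a single word."""
    return word, sum(counts)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Hypothetical in-memory driver: map, group by key (shuffle), then reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for line in lines:
        for word, count in map_words(line):
            grouped[word].append(count)
    return dict(reduce_counts(w, c) for w, c in grouped.items())


if __name__ == "__main__":
    # Illustrative input only; a real job would read its input from HDFS.
    print(run_job(["big data on Hadoop", "Hadoop on Windows Azure"]))
```

On a real cluster the framework handles the grouping step and spreads the map and reduce calls across many machines; the developer typically only supplies the two small functions.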
For end users, the Hadoop-based applications targeting the Windows Server and Windows Azure platforms will easily work with Microsoft’s existing BI tools like PowerPivot and recently announced Power View, enabling self-service analysis on business information that was not previously accessible. To enable this we will be delivering an ODBC Driver and an Add-in for Excel, each of which will interoperate with Apache Hive.
Finally, in line with our commitment to Interoperability and to facilitate the high performance bi-directional movement of enterprise data between Apache Hadoop and Microsoft SQL Server, we have released two Hadoop-based connectors for SQL Server to manufacturing.
The SQL Server connector for Apache Hadoop lets customers move large volumes of data between Hadoop and SQL Server 2008 R2, while the SQL Server PDW connector for Apache Hadoop moves data between Hadoop and SQL Server Parallel Data Warehouse (PDW). These new connectors will enable customers to work effectively with both structured and unstructured data.
I really look forward to sharing updates on all this as we move forward. For now, check out www.microsoft.com/bigdata and check back on the DPI blog tomorrow.
Gianugo
More good news on Microsoft's commitment to Interoperability in the cloud: last week Sandy Gupta, the General Manager for Microsoft's Open Solutions Group, announced that Windows Server Hyper-V is now an officially supported hypervisor for OpenNebula.
This open source project is working on a prototype for release next month and it will soon be possible for customers to build and manage OpenNebula clouds on a Hyper-V based virtualization platform.
"Windows Server Hyper-V is an enterprise class virtualization platform that is getting rapidly and widely deployed in the industry. Given the highly heterogeneous environments in today’s data centers and clouds, we are seeing enablement of various Linux distributions including SUSE, CentOS, Red Hat, and CS2C on Windows Server Hyper-V, as well as emerging open source cloud projects like OpenStack -- and now OpenNebula," Gupta said in a blog post.