I can't help but observe the "discussion" underway with respect to spreadsheet interoperability that Rob Weir has started. Essentially Rob is complaining that Microsoft didn't implement the formula namespace of OpenOffice.
For the chair of the committee to post vitriol like this about the implementation of his own format raises a number of very concerning problems.
I'd like everyone reading the post to know that Rob was invited to participate in the DII events leading up to the SP2 release, and was offered the opportunity to test the beta software specifically for the purpose of providing feedback on the implementation. Normally the chair of the group responsible for the standard being implemented would jump at the chance. Rob didn't, electing instead to wait for the shipping version and then claim that it is somehow deficient compared to other ODF implementations that he has deemed suitable for his purposes.
Does it make sense to have a chair for the ODF TC whose apparent mission is to create a caste system for ODF implementers? Rob debates whether the tough (and publicly vetted) implementation decisions of his constituents are "malice" or "incompetence." Is this the hallmark of a leader in the standards community striving for innovation using open technologies? Is this the characteristic that OASIS wants to promote in the development of technology standards? In Rob, do we really have a person capable of operating in a vendor-neutral forum? If departments within 18 various governments really do use ODF as their standard, should we be comfortable with an ODF TC chair who is trying very hard to discredit and divide its supporters?
Is it time for Rob to step down as chair? I think so.
I'm not saying Microsoft (or anyone) should be the chair instead, but I am saying that Rob is unfit as a leader given his inability to separate his personal venom from his role as a leader in driving the standard forward. It seems like a better approach to empower people on the ODF TC who have a long-term view of the need to enable interoperability, and to move those with more short-term vendor-oriented agendas to the side.
John Head is on point with this post. eWeek seems to be fine with SP2.
As far as I can see, the only thing that Rob is really demonstrating here is that the "grossly inadequate" formula support of ODF (those are the words of David Wheeler, leader of OpenFormula, read on for details) is causing problems with vendors implementing the standard. He instead resorts to scoring implementations based on a percentage of common ground, rather than conformance to something written on paper. This gives Rob the freedom he needs to define his own criteria for what ODF implementation is, and who is doing it according to his rules.
Rob seems to be positioning himself as the final arbiter on what is "good" ODF vs. "bad" ODF. OASIS? specification? – Unimportant when Rob Weir can arbitrarily define criteria for what he thinks is good. He's in a position where only he will declare his own ODF preferences as the blessed implementation. It seems that neither the ODF TC nor the spec matter anymore. It seems that ODF is being run by an individual.
Current ODF standards do not support formulas no matter how much Rob wishes it to be so. Implementations of ODF spreadsheets are application-dependent. ODF 1.2 is not an approved standard. OpenFormula is not an approved standard. While it may be that both are on a path to standardization in the future, today they are not. This is a situation that has been known to the ODF TC for more than 4 years, yet no solution based on an approved standard (other than Open XML) has been found. These are all indisputable facts.
In his post, Rob proposes using "legacy OO namespaces" (also declaring OpenOffice as the "current convention"). Rob's suggestion to use "legacy OO namespaces" is a reference to a vendor's product and indicates favoritism to a particular implementation. The defender of "precise, repeatable, common" seems to be abandoning that hill, hoping instead to claim for his own the dialog that Microsoft has been conducting for a long time: Interoperability requires the participation of many, and will not be defined by a standard alone. Doug covers that pretty well I think.
The irony isn't lost at all. This is the same guy who went to such lengths to chastise Open XML for its undefined list styles and compatibility settings. For some reason his expectations of Open XML seem to be somewhat higher than they are for the committee he chairs. For some reason, it is ok for Rob to patch glaring holes in ODF as "current convention" and then complain vigorously about alleged dependence on Microsoft Office for implementing Open XML. This is shameful, hypocritical and warrants corrective action.
It wouldn't be such a huge deal if the tone were constructive or aimed at improving the situation. It seems he is only interested in distancing himself from scenarios where ODF can be used successfully with Microsoft Office (as well as the DII discussions where that implementation was discussed in detail during its development. Funny that he didn't show up there to share this feedback.)
Rob's conclusion on the cause of that problem:
"I was taught to never assume malice where incompetence would be the simpler explanation. But the degree of incompetence needed to explain SP2's poor ODF support boggles the mind and leads me to further uncharitable thoughts. So I must stop here"
Let's just remember that it was the ODF TC which deemed formulas "out of scope," and after 4 years, it still has no solution for standardizing the definition of "Sum = 2+2." Rob says "Everyone knows what =A1+A2 means." Really Rob? What does it mean if A1 contains 1, and A2 contains "two"? Would it surprise you to learn that Excel and OpenOffice produce different answers in that case? Which one is correct? This question and a thousand more like it are why formula interoperability is hard work, and not at all the trivial matter Rob claims it is.
During the original discussion within the ODF TC, not everyone agreed with the omission of formulas from the spec… David Wheeler seemed to be pretty clear when he commented on this on February 7th, 2005:
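To make the =A1+A2 question concrete, here is a minimal sketch of two coercion policies a spreadsheet engine might choose when a cell holds text. Both policies and all of the names here (add_strict, add_lenient, FormulaError) are hypothetical illustrations; neither is claimed to reproduce exactly what Excel or OpenOffice does.

```python
from typing import Union

Cell = Union[int, float, str]


class FormulaError(Exception):
    """Raised by the strict policy when an operand is not numeric."""


def _to_number_strict(value: Cell) -> float:
    # Hypothetical policy 1: text that does not parse as a number is an error.
    try:
        return float(value)
    except ValueError:
        raise FormulaError(f"cannot use {value!r} as a number")


def _to_number_lenient(value: Cell) -> float:
    # Hypothetical policy 2: text that does not parse as a number counts as 0.
    try:
        return float(value)
    except ValueError:
        return 0.0


def add_strict(a: Cell, b: Cell) -> float:
    return _to_number_strict(a) + _to_number_strict(b)


def add_lenient(a: Cell, b: Cell) -> float:
    return _to_number_lenient(a) + _to_number_lenient(b)
```

With A1 = 1 and A2 = "two", add_lenient returns 1.0 while add_strict raises FormulaError. Both behaviors are defensible, which is exactly why the answer has to be written down somewhere that every implementer agrees on.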
"This previous comment scares me: "There are from our point of view also no interoperability issues, because the namespace prefix mechanism we have specified unambiguously specifies what syntax and semantics are used for a formula". Here's how I read that: "Every implementation must reverse engineer all other implementations' namespaces (they're not in the spec, so everyone's free to invent their own private incompatible namespaces). Then, every implementation must implement all the syntax and semantics of all other implementations' namespaces for formulas, if they wish to achive interoperability. And oh, by the way, your implementation might not implement the namespace for the document you're trying to load, so you may lose all the formulas."
I'm sure that's not what was meant, but that's how it reads to me. I hope that helps explain why I think that the current formula information in the OpenOffice specification is grossly inadequate."
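The failure mode David describes can be sketched in a few lines. This is an illustration only: it assumes formulas arrive as strings of the form prefix:=expression, the function names are hypothetical, and the sample prefixes are placeholders. A real loader would resolve the prefix to its namespace URI rather than compare prefix text.

```python
from typing import Optional, Set, Tuple


def split_formula(attr: str) -> Tuple[Optional[str], str]:
    """Split 'prefix:=expr' into (prefix, '=expr'); prefix is None if absent."""
    head, sep, tail = attr.partition(":=")
    if sep and head.isidentifier():
        return head, "=" + tail
    return None, attr


def load_formula(attr: str, supported: Set[str]) -> Optional[str]:
    """Return the expression if this loader understands its syntax, else None."""
    prefix, expr = split_formula(attr)
    if prefix is None or prefix in supported:
        return expr
    # The prefix names a syntax this implementation never learned:
    # the formula cannot be interpreted and is dropped on load.
    return None
```

A loader that only knows the placeholder prefix "oooc" keeps oooc:=[.A1]+[.A2] and silently drops a formula written under any other prefix, which is the "you may lose all the formulas" outcome in the quote.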
So… maybe it's too easy, but "I was taught to never assume malice where incompetence would be the simpler explanation." David Wheeler saw this coming over 4 years ago, and yet, OpenFormula is not a standard today, and ODF has no definition for spreadsheet formulas. Rob tries to excuse his way around this in his post, but these comments are made by the committee that he chairs. I'll leave it to you, then, to decide between "malice" or "incompetence" of the poster who would elect to throw his own committee under the bus to get hits on his blog… or fail to take this very good advice.
By the way, it is worth noting the response to this stern (and very accurate) prediction.
"Hi David, Thanks for the concerned comments and all the considerable effort you have put into solving this problem. You're challenging us all to go where none have dared tread before. So go ahead and lead the way. You have the TC's attention. We are listening. As you grind out the grit of your proposal, please keep in mind that we have to fit proposed solutions into the politic of work that has already been done. A politic that represents years of work that is just now on it's way to ratification at OASIS, and beyond to ISO. Keep in mind also that the ISO certification comes at the request of the European Union. Time is of the essence. Ratification perhaps trumps perfection. At least for the moment."
This comment was from Gary Edwards, (he of "cracks in the foundation" / OpenDocument Foundation fame) who eventually left the TC and shuttered the OpenDocument Foundation. I seem to remember some dialog from Rob about Open XML being "rushed" through standardization. Funny how those things come back to haunt you.
I'm very discouraged by Rob's post. As far as I can tell, Rob is playing a shell game where only his definition will be good enough for supporting ODF, and that definition will change to whatever Microsoft isn't doing.
This is far from constructive. This is not a way to foster interoperability and industry dialog. This is not a leader for people to follow.
I could make a full-time job of tearing down the "say anything we possibly can" approach to Open XML opposition. Seems like we're seeing a new level of arm-flailing and finger pointing, now that we are weeks away from the close of the post-BRM period. I wanted to offer some comments about the SFLC's analysis of the OSP. This is an unfortunate report. The issues it raises have all appeared before in a campaign that includes innuendo and supposition, leaving out inconvenient information and language and ignoring the same, similar, or less attractive language that exists for ODF.
The big news in this is their admission/confirmation that the OSP is in fact compatible with the GPL. They say "The OSP cannot be relied upon by GPL developers for their implementations not because its provisions conflict with the GPL but because it does not provide the freedom that the GPL requires." They go on to explain that the missing "freedom" comes down to the possibility that new versions of the specifications could be excluded from the OSP in the future.
It is unusual for promises like the OSP to automatically include every spec or all future versions (IBM's pledge is exactly like ours). The norm is for new versions to be covered only once they have been added to the promise. In the case of Sun's statement, new versions are automatically added only when Sun participates in the development of the new version to the extent that the OASIS IPR Policy would then obligate it to provide patent rights. None of these promises include future versions of the specifications without any qualification.
Let's deal with the points one by one:
This section points out that the OSP only applies to listed versions of covered specifications. True, except that we have already committed to extending it to ISO/IEC DIS 29500 when it is approved in our filing with ISO/IEC. For ODF, IBM in their ISP takes the identical approach. Strange how things that seem appropriate for ODF are not appropriate for Open XML.
Not true. The OSP is a promise to not assert patents that are necessarily infringed by implementations of covered specifications. Like all similar patent non-asserts (including the Sun and IBM versions for ODF) the promise covers that part of a product that implements that specification (and not other parts that have nothing to do with the specification). While the Sun covenant is silent about conformance to the specification, the OSP allows implementers the freedom to implement any (or all) parts of a covered specification and to the extent they do implement those portions (also known as conform to those parts) they are covered by the promise for those parts. Contrast that to the IBM pledge that requires total conformance and so programming errors or absence of something required by the spec (but not by an implementer's product) means that the promise is totally void for that product.
Not true. As far as we are concerned we are happy to extend the OSP to implementers who distribute their code under any copyright license including the GPL. The FAQ cited just states what everyone knows and acknowledges: the GPL is a copyright license that is drafted in a way that leaves many issues (not just those related to patent rights) open to many interpretations. Any particular user or implementer should read the GPL carefully and make their own judgment about what it means and requires in accordance with their own circumstances. The FAQ states that Microsoft is not in a position to give blanket advice about the GPL to others. They missed these two FAQs for some reason...
"Q: Is the Open Specification Promise intended to apply to open source developers and users of open source developed software?
A: Yes. The OSP applies directly to all persons or entities that make, use, sell, offer for sale, imports and/or distributes an implementation of a Covered Specification. It is intended to enable open source implementations, and in fact several parties in the open source community have specifically stated that the OSP meets their needs. Moreover there are already a significant number of implementations of Covered Specifications that have been created and/or distributed under a variety of open source licenses as well as under proprietary software development models. Because open source software licenses can vary you may want to consult with your legal counsel to understand your particular legal environment.
Q: Is this Promise consistent with open source licensing, namely the GPL? And can anyone implement the specification(s) without any concerns about Microsoft patents?
A: The Open Specification Promise is a simple and clear way to assure that the broadest audience of developers and customers working with commercial or open source software can implement the covered specification(s). We leave it to those implementing these technologies to understand the legal environments in which they operate. This includes people operating in a GPL environment. Because the General Public License (GPL) is not universally interpreted the same way by everyone, we can't give anyone a legal opinion about how our language relates to the GPL or other OSS licenses, but based on feedback from the open source community we believe that a broad audience of developers can implement the specification(s)."
I recall saying recently that "this is my last post for 2009." Whoops... I don't think I was anticipating this. I watched with interest yesterday the coverage and reaction to the i4i judgment. I am not keen to share my own thoughts about the case here, but I would like to offer clarity around the specific area of Word in question, and suggestions for what people can do about it if they are using that functionality today. There is much confusion about the part of Word that is actually affected.
First, some things to understand:
We do not anticipate any interruption in the availability of Word or Office 2007. Additionally this ruling has no impact on the scheduled availability of the 2010 Office version which is planned for the first half of CY2010.
Current users are not affected. If you are using the custom XML tags in Word 2003 or 2007 (these show up in Word as Pink Tags around tagged content), you are free to continue doing so with the products you have already purchased.
Open XML standards (all ECMA and ISO versions) are not affected. Even if Word's specific implementation of custom XML support does infringe the i4i patent (which Microsoft does not believe to be the case), i4i has never claimed that its patent is essential to the OXML standard.
Content Controls of Word (screen shot below) are not affected. In Word 2007 and Word 2010, this is a common method of binding document content to data stored in a custom-defined schema within a document.
The functionality that is in question is indicated by the screen shot below. Custom XML Tags in Word documents are visible in the Word user interface as Pink Tags surrounding tagged content in a document.
What you can do if you have questions about your solutions that use Custom XML Tags:
First, download the Office 2010 beta and test your solution. If your solution works in Office 2010, it does not depend on the functionality in question. If your solution does utilize Custom XML Tags, consider re-implementing the solution using Content Controls. Detailed guidance on the use of Content Controls in Word 2007 can be found here. Also note the Word Content Controls Toolkit on CodePlex. The Open XML SDK, of course, is quite useful for getting people up to speed on developing solutions for Word and Open XML.
Update: Additional Detail
In response to several inquiries on the topic, I have included additional text describing the feature area that is affected vs. what is not affected, including links to KB articles which illustrate the capabilities in more detail.
Affected:
Word 2003 and Word 2007 distributed prior to 1/11/2010 can read files that contain XML markup (ref: “Understanding Word's XML Markup [Word 2003 XML Reference]”, http://msdn.microsoft.com/en-us/library/aa212889(office.11).aspx). When custom XML markup is present, Word delineates this content in a Word document, which allows it to later save the file to .DOCX, .DOCM, or .XML with that content marked up.
The Word 2007 product distributed by Microsoft after 1/10/2010 will no longer read the Custom XML markup contained within .DOCX, .DOCM, or .XML files. These files will continue to open, but the Custom XML markup tags will be removed. Custom XML markup stored within .DOC files will not be affected by these changes. Word 2003 and existing installations of Word 2007 will not be affected by this change.
Not Affected:
Word 2007 also added features allowing Content Controls to map to XML data stored in a DOCX or DOCM file (ref: “Mapping Word 2007 Content Controls to Custom XML Using the XML Mapping Object”, http://msdn.microsoft.com/en-us/library/bb510135.aspx). Content Controls and XML data stored within DOCX or DOCM files will not be affected by this change.
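If you want a quick first look at whether a given file uses the affected markup before testing it in Word, a rough sketch along these lines may help. It assumes the main document part lives at the conventional path word/document.xml, and it inspects only that part (not headers, footers, or other parts). The function name count_markup is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by .docx/.docm document parts.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_markup(docx_file):
    """Count custom XML tags and content controls in the main document part.

    docx_file may be a path or a binary file-like object.
    Assumption: the main part is stored at word/document.xml.
    """
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return {
        # w:customXml elements are the Custom XML Tags (the Pink Tags).
        "custom_xml_tags": sum(1 for _ in root.iter(f"{{{W_NS}}}customXml")),
        # w:sdt elements are Content Controls, which are not affected.
        "content_controls": sum(1 for _ in root.iter(f"{{{W_NS}}}sdt")),
    }
```

A non-zero custom_xml_tags count suggests the document relies on the functionality in question and is a candidate for re-implementation with Content Controls. Treat the result as a hint, not a substitute for testing the solution itself.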
Updated. April 28, 2009. Office 2007 SP2 is available for download. For more information: http://blogs.technet.com/gray_knowlton/archive/2009/04/27/office-2007-service-pack-2-kiosk.aspx
For those of you who have been following the file format issue for a while, you'll recognize today's action by Microsoft as another significant step forward in enabling interoperability. This hopefully sends a signal (again) to our customers that we are committed (like all successful software businesses are) to addressing the needs of people who use our products by providing choice and interoperability. (Read more, Read even more)
If you missed the announcement, it roughly said the following:
Microsoft Office functionality will be updated to include ODF, PDF and XPS support:
Microsoft will contribute to the future evolution of the ODF and Open XML specifications.
Microsoft will contribute to the developments of other document format standards:
A lot of the folks who read the blogs are (rightfully) consumed with the "why?" question… or perhaps even the "Why now?" question. I'd like to take a little time to explain both. I think I can help add some context about why this decision was made, and why we think now is an appropriate time to take these steps.
There are really two central catalysts for these actions. One of these is the feedback we have received from the regulatory environment. There is a high degree of interest in our working with other software vendors to improve information exchange through the use of standardized technologies. In addition, we remain committed to promoting interoperability in our products which means creating the technology bridges necessary for the successful exchange of data with other solutions.
The second catalyst is how these advancements will help drive success in our business. Folks will offer theories across the spectrum about what Microsoft is "trying to do" or what these actions "mean." I'd like to offer a very simple rationale to explain why this is a net positive for our business, and to illustrate some of the thinking about our timing for the adoption of ODF.
Success in our industry (like a lot of other industries) boils down to successfully addressing the needs of customers. By offering greater choice for file formats, our products address more scenarios and provide greater flexibility in enabling specific solutions. From a pragmatic standpoint, adding ODF to Office allows us to re-focus Office on product capabilities rather than a debate about file formats. We're quite comfortable when we compete in the marketplace on these merits.
A natural follow-on question seeks to understand why we would bother with Open XML when we could have just supported ODF from the beginning and moved on…
(I'm oversimplifying a bit here, but) questions about compatibility and moving legacy content forward were very important to our customers, and we were already well down the road with XML-based formats that were designed to represent legacy content. Because ODF side-stepped the compatibility question, we were left to solve (continue solving) that challenge elsewhere; the aversion to dealing with legacy content created a real problem for customers who want to transition to more open file formats.
Speaking very plainly, business continuity is one of the most important drivers of software purchasing decisions. The goals of Open XML with regard to compatibility and preserving legacy content are things that we simply could not do without.
Those who have been involved with Microsoft Office for many years will remember the problems created when Microsoft essentially "flipped the switch" on a new document format for Office '97, offering little consideration for compatibility with existing applications. This had a negative effect on our business, and we were not keen to repeat the mistake. In many standards committee meetings, compatibility was regarded as a "factor," but in reality, the list of things that rank higher in importance is very short.
Open XML is a necessary, worthy standard; it is unique in its intent to address this problem. We will sustain our investment in Open XML through participation in ECMA and ISO, as well as in the development community. We are also committed to having a high-quality implementation of ODF. We accept the responsibility of driving toward interoperability with other products and platforms. We will work with supporters of Open XML, ODF, PDF and XPS to achieve interoperability.
Achieving meaningful, successful interoperability involves participation in the evolution of the standards as well as conducting public forums on real-world implementation issues. In our early testing we are observing that every product implementing these standards has some level of variation from the written spec. If you've been around standards for a while, you'll know this is common, and requires dialog to establish best practices & patterns. This is our reason for joining the OASIS, AIIM and ISO committees, as well as our motivation for hosting public forums like OpenXMLDeveloper.org to discuss our implementation of the formats. These are environments where we hope to learn as much as we contribute… we now get to the real work of enabling interoperability rather than theorizing about its potential in committees. I know the work will be challenging, but I am hopeful that this will ground the document format standards conversation in real-world implementation conversations, where we can uncover and resolve issues that make products share data with greater success.
Just some technical notes (and to tee up future posts): Office 2007 Service Pack 2 will incorporate support for ODF 1.1, to align with the other significant products and policies that support ODF today. SP2 will also support PDF 1.5, and the ISO standard PDF/A. These PDF versions are intended to maximize compatibility with the existing base of installed PDF viewing applications.
Office 14 will update our support for IS29500. The timing for this might seem strange, but I do hope the rationale is clear. ODF 1.1 is a completed specification. The final version of IS29500 is not published today. While we do support a significant portion of IS29500 already, the BRM changes and other issues raised in public forums will inform us on how to best move forward with IS29500… and it gives me a little time to address the compatibility considerations that will be an important part of any file format related changes in Office. ODF has a potential upside in expanding interoperability, but as always, business continuity requirements will have a significant effect on our approach to these file format changes. Our customers will accept nothing less…
Update: If you would like to sign up for the beta program for the tools, please email the following alias: mailto:OFAPPCPT@Microsoft.com
Update: Read more details about the tools in these two subsequent posts:
http://blogs.technet.com/gray_knowlton/archive/2009/11/10/office-2010-application-compatibility-deep-dive-on-the-code-compatibility-inspector.aspx
http://blogs.technet.com/gray_knowlton/archive/2009/11/02/office-2010-application-compatibility-deep-dive-on-environment-assessment-tool.aspx
Hello, my name is Michael Kiselman. I am a technical product manager driving the Office 2010 application compatibility program on the Office developer marketing team. I’d like to share our exciting news about application compatibility we’re unveiling today at the SharePoint Conference.
With the great value Office 2010 brings for end users, IT Professionals and Developers, we are also investing heavily in making deployment of the new version of Office easier. As part of our focus on deployment, we have renewed priority on helping ensure applications and Add-ins for existing installations of Office continue to work without hangs, crashes or performance degradation when interfacing with Office 2010.
IT departments charged with upgrading Office take special care to find the add-ins, macros and other third-party applications users have installed to ensure they will not cause problems after the upgrade is complete. Developers (professional and non-professional dealing with macros and scripts in Office applications), on the other hand, spend time testing and migrating their code to work seamlessly in Office 2010. And then, there is the task of migrating pre-Office 2007 binary documents to the latest Open XML-based file formats.
Today we are announcing the Office 2010 Compatibility Program to help address these areas. The compatibility program will provide tools for environment assessment, code scanning and remediation assistance, and an update to the document conversion tools introduced with Office 2007. The tools, guidance and services we are delivering will be the most comprehensive we have provided to date for a new release of Office.
The Application Compatibility program will be delivered in the form of tools, guidance and programs.
The Office Environment Assessment Tool (OEAT) and the Code Compatibility Inspector are new tools that will be made available to assess the current state of desktop installations, and to scan code for potential issues. We will also update the Office Migration Planning Manager for Office 2010. Comprehensive guidance in the form of an Application Compatibility Analysis and remediation guide will be offered as well on TechNet and MSDN.
Figure 1: Office Environment Assessment Tool
We can share a little about the new tools we are building to give you an idea of where we’ll provide help.
Office Environment Assessment Tool:
· Discovers currently installed applications
· Discovers Add-ins currently in use by Office clients
· Discovers Programs that are not registered as Add-ins but still interact with Office programs
· Environmental assessment (potential upgrade issues)
· Add-in compatibility assessment – relates information about the program’s compatibility with Office 2010 from the TechNet site.
Code Compatibility Inspector:
· Scans Visual Basic for Applications (VBA) Solutions for potential issues
· Scans Visual Studio Office projects for potential issues
· Performs a simple text search (likely candidate search) for known properties and methods in the Office Object Model that changed
· Provides the option to comment/mark those areas in the code where text search has identified a possible OM match
· Summary of total lines of code scanned as well as total lines identified as potential candidates for OM changes
· A detailed report, with module name, line number, and links to remediation for each issue found with possibly a red/yellow flag for impact guidance
· Scans and optionally updates Declare statements for 64-bit compatibility
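To give a feel for what a "likely candidate" text search looks like, here is a small sketch. It is not the Code Compatibility Inspector and says nothing about how that tool is built. The member names in CHANGED_MEMBERS are made-up placeholders, and the only VBA fact relied on is that 64-bit-ready Declare statements carry the PtrSafe keyword.

```python
import re

# Placeholder names standing in for object-model members that changed.
# These are hypothetical, not a real list of Office OM changes.
CHANGED_MEMBERS = ["OldMethod", "LegacyProperty"]

# A Declare statement that is not immediately followed by PtrSafe.
DECLARE_RE = re.compile(
    r"^\s*(?:(?:Public|Private)\s+)?Declare\b(?!\s+PtrSafe\b)",
    re.IGNORECASE,
)


def scan(source, members=CHANGED_MEMBERS):
    """Return (line number, message) pairs for lines worth a closer look."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Crude comment stripping: cut at the first apostrophe.
        # This mis-handles apostrophes inside string literals.
        code = line.split("'", 1)[0]
        for name in members:
            if re.search(rf"\b{re.escape(name)}\b", code):
                findings.append((lineno, f"uses {name}"))
        if DECLARE_RE.match(code):
            findings.append((lineno, "Declare without PtrSafe"))
    return findings
```

Because this is plain text matching, it will produce false positives (a local variable that happens to share a name) and miss anything built dynamically. That is why the results are best read as candidates for review rather than confirmed problems.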
Figure 2: Inspecting VBA projects with the Code Compatibility Inspector
Want to get involved?
The beta of the tools and the draft of the Assessment and Remediation Guide will be available for customers and partners on Microsoft.com download center by early December. We will update this blog when they become available.
These tools and guidance will be available to our customers and partners through a variety of services, such as Desktop Deployment Planning Services for partners or the Deployment Optimization of Windows and Office MCS offers. The tools and guidance will be available in virtually all of our deployment planning activities, so look for them to land in a program near you.
Along with the tools, guidance and programs, we will also launch a partner program to provide an opportunity for Microsoft partners to pledge the compatibility of their products with Office 2010 and list those products on the upcoming Office 2010 Application Compatibility Center on TechNet. Some of you may have noticed the re-designed Office developer center on MSDN; we’ll continue to add to that with our compatibility activities.
Michael.
The first compatibility pack for Open XML was released in November of 2006. This add-in for Office XP and 2003 (which also works with Office 2000 in some cases) enables users to open, edit and save Open XML files using prior releases of Office. The compatibility pack is designed to ease the pain of introducing a new file format. As we learned in Office 97, changing file formats can create some significant deployment and compatibility challenges. It is a migration that we're handling with all due care and consideration for our customers' business continuity requirements.
The availability of the compatibility pack has been an interesting discussion. Today, the compatibility pack is only available as a manual download. In other words, Microsoft does not "push" the compatibility pack to users using its update tools. IT organizations or end users must manually download the tool, and deploy or install it themselves. Many organizations have (literally) demanded this be made available as an automatic update, while others would be dissatisfied with this, claiming that Microsoft is "forcing" Open XML onto its existing user community.
We decided to make it available as a manual download, and not as an automatic update. During the first 12 months of its release, the compatibility pack has been successfully downloaded over 20 million times. This means that 20 million people have elected to manually download this 26.2 MB package to their computer. This is a significant number of people adding Open XML to their environment.
Why do people download the compatibility pack? – to use Open XML, of course. If a user of Office 2003 or XP tries to read/edit an Open XML file type, Windows will offer the "Use a web service to find the appropriate program" dialog box to direct you to the compatibility pack download site. If you have updated Office with the latest service packs, you will get a similar (but more user-friendly) dialog box that directs you to the same place.
On the download center, users select their language, get the bits and off they go. The 20 million people who have already completed this demonstrate that Open XML is already in widespread use today, about 1 year after its formal introduction with Office 2007. This is in addition to the adoption Open XML is gaining in the broader software community: http://www.openxmlcommunity.org.
What is also interesting about the compatibility pack statistics is that they do not reflect deployment by IT organizations… It takes only one download by the IT desktop management team to prepare thousands of desktops with the compatibility pack (I have worked on a handful of these directly). The usage numbers for the compatibility pack are likely to be significantly higher than the download statistics indicate.
I won't explain in detail how these download numbers compare to things like the ODF Translator for Microsoft Office, but you can look at the download stats on SourceForge for that one and see for yourself. Being a product person (not a standards person) I'm far more interested in what users are doing with the software, so I don't have a positive or negative view of ODF (nor do I care to swordfight with the ODF community). But the statistics do speak pretty clearly about the preference of Microsoft Office users…
I believe in the marketing lexicon this is typically referred to as "rapid traction," but it does come with the responsibility of sustainability (speaking of buzzwords) and maintenance. Our commitment to the standard goes hand-in-hand with our long-term commitment to IT organizations and end users who have taken the opportunity to incorporate Open XML into their Office environment. Instead of the theoretical arguments and "what-if" scenarios that the document format standards community gets into, longevity of Open XML is a real consideration based only on the activity of people who use our products. In other words, Open XML is here to stay.
That's pretty exciting news.
Happy Holidays everybody.
I recall this Rob Weir post.
"Now we shouldn't be so careless as to say that there are only 2,000 OOXML document in existence, or for that matter only 160,000 ODF documents. Not all documents are posted on the web. In fact, most of them are sitting on hard drives, in mail files, behind corporate firewalls, etc. The documents that Google sees is only a sampling of real-world documents. But this is true of both ODF and OOXML. My hard drive is loaded with ODF documents that are not included in the above sampling. But however you spin it, the minuscule number of OOXML documents and their pathetic growth rate should be a cause of concern and distress for Microsoft."
Fast-forward to today, where I was just doing some checking on file format adoption:
Google File Type Search Results
Type | Open XML | ODF
Document | DOCX: 94,000 | ODT: 81,200
Spreadsheet | XLSX: 18,000 | ODS: 17,100
Presentation | PPTX: 32,800 | ODP: 25,900
Aside from the numbers, the Google Trends graph really illustrates the story best (spreadsheets, presentations as well):
Indeed, Open XML has now passed ODF in terms of adoption (at least insofar as this is a measure). I'm assuming that if this measurement was good enough then, it's good enough now as well. I'm hoping we'll see Rob updating his chart soon.
But this isn't really what we're after… let's face it, the Open XML / ODF conversation has evolved far beyond this now. As Microsoft Office is working to support ODF and Open Office supports (at least partially) Open XML, the conversation is really about ensuring interoperability of both formats. Regardless of whether you favor Open XML or ODF, the path is really a means to making a dent in these numbers, or perhaps this one. Over time, we should expect to see a significant trend downward for these binary documents. But how long will it take?
Binary Formats
Binary DOC count: 44,600,000
Binary XLS count: 1,800,000
Binary PPT count: 5,990,000
At Microsoft we are now committed to helping solve the interoperability challenges required to make the meaningful dent in the binary formats. Like many others, we are sitting at the table to make a contribution to the discussion. We are moving in a positive direction, investing in Open XML, ODF, and in the interoperability conversation for real implementations.
Just like any other hard conversation, 90% of the winning formula is showing up with a constructive mindset and right intent. We're hopeful to continue the positive dialog with the ODF community and Open XML implementers alike.
Sorry for the absence; I took a short break from work and blogging after the birth of our second child. Being a parent is a great blessing. It's just the signing up for 12 more months of 3-hour increments of sleep that I'm not so sure about :).
But it's back to work for me now, and it is a pleasure to return to some great news related to the adoption of Open XML. The Compatibility Pack, software that allows you to open, edit and save Open XML format documents in Office XP and 2003, has now been downloaded over 100 million times. This is quite a strong indicator of the global adoption of the Open XML formats. This is incredibly positive news.
Why?
As I discussed when we were at the 20 million mark, the compatibility pack is a manual download. It is not pushed through any update channels*. In order for an end user to obtain it, they must visit the Microsoft download center, select one of the 35 available languages, and download the 26MB installer. To say it differently, more than 100 million people have had cause to seek out and download the compatibility pack for Open XML; likely due to their encountering a document stored in one of the formats.
This number also does not include IT departments who have pushed the compatibility pack to users through tools such as WSUS or other software management services. Typically that would have a download count of 1, and a distribution count of thousands. I have worked on several of those projects with various customers. The number also excludes our OEM partners who have elected to distribute the compatibility pack. Two months ago I purchased an HP Laptop which came with the compatibility pack pre-installed.
Also worth noting is the conservative nature of this measurement. The statistic measures known, completed downloads, but we're also aware that in many cases the download completes successfully even if we don't receive the feedback that it has. It is very likely that the number of actual end user downloads greatly exceeds 100 million. We're also not counting the number of downloads of the free viewers for Word, Excel and PowerPoint 2007.
Combined with the outstanding traction of Office 2007 to date, we are now at a point where a substantial percentage of business productivity desktops are reading and writing Open XML documents.
This is also a good time to refresh this data. As of today, the gap between the number of indexed documents for Open XML and ODF is increasing. According to Google file type searches:
Format | Oct 08 result | June 09 result | % increase
DOCX | 94,000 | 297,000 | 216%
ODT | 81,200 | 132,000 | 63%
XLSX | 18,000 | 86,200 | 379%
ODS | 17,100 | 28,800 | 68%
PPTX | 32,800 | 94,900 | 189%
ODP | 25,900 | 46,900 | 81%
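For anyone who wants to check the percentage column, here is a minimal sketch of the arithmetic. The counts are copied from the Google file type results above; the function and variable names are my own and purely illustrative.

```python
# Indexed document counts copied from the results above: (Oct 08, June 09).
COUNTS = {
    "DOCX": (94_000, 297_000),
    "ODT": (81_200, 132_000),
    "XLSX": (18_000, 86_200),
    "ODS": (17_100, 28_800),
    "PPTX": (32_800, 94_900),
    "ODP": (25_900, 46_900),
}


def percent_increase(old, new):
    """Return the growth from old to new as a whole-number percentage."""
    return round((new - old) / old * 100)


if __name__ == "__main__":
    for fmt, (oct_08, jun_09) in COUNTS.items():
        print(f"{fmt}: {percent_increase(oct_08, jun_09)}%")
```

Each value is simply (new - old) / old, rounded to the nearest whole percent.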
As I also said in my prior post on format adoption, however, relative to the 81 million binary Office documents indexed on Google, we have a long way to go. It's great to see that we're off to a great start on Open XML though.
*You can see from Microsoft Update that patches or updates to the compatibility pack are offered as automatic updates. The compatibility pack itself, however, is not available through any automatic update channels.
UPDATE: The links for the downloads for SP2 are in the process of replicating across mirrors for our WW download center. The download links are expected to be live at around 11:00AM PDT.
Update: The download links for SP2 are now live.
The Office development team has been pretty busy with the Service Pack 2 release (SP2). This is a monster Service Pack release for Office. The Sustaining Engineering blog for Office has quite a bit of the background, but I wanted to raise awareness on a few key aspects of the release, partially for the world-at-large, but more for the developer audience. Allow me to staple a handful of worthy links on my blog to get you started on the depth of data available about SP2.
Depending on your perspective, many things in this Service Pack are of significance. From a personal standpoint, the arrival of ODF 1.1 is something that I am very happy to see. I have also been pleased to see Microsoft step up when it comes to interoperability in the document format space through its publication of the ODF Implementers Notes in December, its publication of the Open XML implementers notes in January and its ongoing support for the Document Interoperability Initiative and a range of other activities. It has been 425 days since we posted our Interoperability principles, and it is great to see us sustaining that commitment and continuing to exceed expectations.
· Where can I download SP2? – You can pop up to Microsoft Update and install the bits
· Where can I learn about what is in SP2? – Here
· What files / DLL’s / exes have been changed? – learn about that here
· Is this an Automatic Update? – Not yet. For at least the first 90 days, service packs are made available as a manual download. After 90 days and with a 30 day notice, Service Packs are offered through the Automatic Update channel as a critical update.
OpenDocument 1.1 (ODF) has been added as an available file format for saving documents in Word, Excel and PowerPoint. Doug Mahugh has covered this extensively in his blog, and the ODF 1.1 implementer’s notes have also been available for a while. We first announced our intent to add ODF to the list of supported file types over a year ago. It is great to see this activity come to fruition. I’m especially pleased / surprised at the level of engagement from folks in the ODF community, helping talk through some of the harder parts of the installation. In case you are observing the feature-level impact of saving to ODF in Office, you can visit the links below to learn more about how ODF in Office will behave.
· PowerPoint: http://office.microsoft.com/en-us/powerpoint/HA102877231033.aspx
· Excel: http://office.microsoft.com/en-us/excel/HA102877221033.aspx
· Word: http://office.microsoft.com/en-us/word/HA102835631033.aspx
Many have seen our announcement a few years back about the addition of PDF and XPS to the list of supported file types of Office 2007. This add-in was originally offered as a free download for Office 2007, but SP2 has taken that a step forward and added the bits to the release, so it is no longer a manual download. PDF export functionality will continue to support the creation of PDF 1.5 documents, as well as the ability to generate PDF/A (ISO 19005) compliant files.
Stephen Peront has an excellent post which illustrates how to use the new converter API. The converter API is an extension of our strategy to support file format choice in our products. It enables solution developers to register a new file type for Office, so that it appears in the file type drop-down dialog box for saving documents next to the other 18 that you now get in the box. In a way this will help developers “future proof” Office desktops for new document format standards that may emerge.
An area of pain for users of Office charting has been addressed. A charting Object Model (OM) for Word and PowerPoint has been added to align with the charting support in Excel. Many customers expressed a need to programmatically insert, manipulate the size, and set the formatting of the charts similar to what was provided in the Office 2003 release. Potentially managing charts programmatically across the three core applications could save workforces thousands of hours of manual labor, depending on the level of complexity and content re-usage taking place. For more information on the changes to the Word and PowerPoint OM’s for Charting, look at this post from David Hale on the Office Developer Content blog.
Today Vista supports a feature / interface known as Cryptography API: Next Generation (CNG). Essentially this refers to a capability of Vista which (among other things) allows you to swap crypto providers without breaking your solutions, or to help future-proof Office 2007 installations against encryption algorithms that may emerge in the future. Office 2007 SP2 has been updated to support the same CNG functionality when installed on Vista. This provides the capability to swap crypto providers for Office documents. This was done in part to help people who desire to implement Suite B encryption for Office documents. David LeBlanc has written an excellent post describing this addition in depth.
One of the most important end user benefits of the SP2 release is the improvement in Outlook performance. I have been dogfooding SP2 for over a month now, and I can attest personally to the life improvement that these changes bring :). I think if you asked the Outlook team, they’d be quick to tell you that SP2 is an update you should install as soon as you can.
Performance improvements apply to the following general responsiveness areas:
- Startup: Removes lengthy operations from initial startup.
- Shutdown: Makes Outlook exit predictably despite pending activities.
- Folder View and Switch: Improves view rendering and folder switching.
- Calendar improvements: Improves data structures and the reliability of calendar updates.
- Data file checks: Greatly reduces the number of scenarios in which you receive the following error message when you start Outlook: “The data file 'file name' was not closed properly. This file is being checked for problems.”
Traditionally, you cannot uninstall Microsoft Office service packs without completely uninstalling the Microsoft Office products. The new Microsoft Service Pack Uninstall Tool for the 2007 Microsoft Office suite (Oarpman.exe) lets you uninstall all the updates for the 2007 Office desktop products that are included in the 2007 Office suite SP2. The Service Pack uninstall tool will be available on the Microsoft Download Center as a free download.
You can use this tool to remove all of the client updates at once, or to remove them individually.
- Sample command line: “msiexec /i {MSI GUID} MSIPATCHREMOVE={Patch GUID} /l*vx {path of the log file}”
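If you are scripting this across machines, the sample command line above can be assembled programmatically. What follows is only a rough sketch: the helper name, the all-zero GUIDs and the log path are placeholders I made up for illustration, and the code only builds and prints the argument list rather than running msiexec.

```python
def build_patch_remove_args(product_guid, patch_guid, log_path):
    """Assemble the msiexec arguments from the sample command line.

    product_guid: the MSI GUID of the installed product, braces included.
    patch_guid:   the GUID of the patch to remove, braces included.
    log_path:     where msiexec should write its verbose log.
    """
    return [
        "msiexec",
        "/i", product_guid,
        f"MSIPATCHREMOVE={patch_guid}",
        "/l*vx", log_path,
    ]


if __name__ == "__main__":
    # Placeholder values only; substitute the real GUIDs for your install.
    args = build_patch_remove_args(
        "{00000000-0000-0000-0000-000000000000}",
        "{11111111-1111-1111-1111-111111111111}",
        r"C:\Temp\sp2-uninstall.log",
    )
    print(" ".join(args))
```

Keeping the arguments as a list (rather than one long string) makes it easier to hand them to a process launcher later without worrying about quoting the log path.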
- SmartArt® Graphics & Charting
o Better rendering performance
o Better printing fidelity
o Fixes issues in the object model to achieve better parity with Office 2003
o Improves the Edit Points feature. This enables more accurate shape editing and increased interoperability with Office 2003
- Microsoft Office Access
o Lets you export reports to Microsoft Office Excel
o Fixes issues that occur in the import data wizards
o Fixes issues in report printing and previewing
o Fixes issues in macros, in Excel integration, and in date filters
- Microsoft Office Groove
o Limits the number of file-sharing workspaces to 64 to make sure that all workspaces can be synchronized. This limit applies only to adding new file-sharing workspaces. If you already have more than 64 file-sharing workspaces, you can continue to use them.
- Microsoft Office Word
o Improves the fidelity of .pdf and .xps output
o Improves Outlook (Word editor) performance
Lots of folks will have lots to say about SP2 – they should; it’s a big release. Here are some links to great blogs that you can read. The table below contains a link to the KB articles & downloads for each product to be released.
Access Team Blog
Excel Team Blog
Word Team Blog
Groove Team Blog
InfoPath Team Blog
Visio Team Blog
Doug Mahugh
SharePoint Team Blog
Daniel Escapa's Blog
Outlook Team Blog
Project Team Blog
PowerPoint Team Blog
SharePoint Designer
SP2 for SharePoint
Product | KB article
Office Client Products
The 2007 Microsoft Office Suite Service Pack 2 | 953195
Microsoft Office Language Pack 2007 Service Pack 2 | 953195
Microsoft Office Project 2007 Service Pack 2 | 953326
Microsoft Office Project Language Pack 2007 Service Pack 2 | 953326
Microsoft Office SharePoint Designer 2007 Service Pack 2 | 953292
Microsoft Office SharePoint Designer Language Pack 2007 Service Pack 2 | 953292
Microsoft Office Visio 2007 Service Pack 2 | 953327
Microsoft Office Visio Language Pack 2007 Service Pack 2 | 953327
Microsoft Office Proofing Tools 2007 Service Pack 2 | 953328
Microsoft Office Access Runtime and Data Connectivity Components 2007 Service Pack 2 | 957262
Calendar Printing Assistant for Microsoft Office Outlook 2007 Service Pack 2 | 953329
Microsoft Office InterConnect 2007 Service Pack 2 | 953330
Microsoft Office Compatibility Pack Service Pack 2 | 953331
Excel Viewer 2007 Service Pack 2 | 953336
PowerPoint Viewer 2007 Service Pack 2 | 953332
Visio Viewer 2007 Service Pack 2 | 953335
Microsoft Office Language Interface Pack 2007 Service Pack 2 | 953339
Microsoft Service Pack Uninstall Tool for the 2007 Microsoft Office Suite | 954914
Office server products
The 2007 Microsoft Office servers Service Pack 2 | 953334
The 2007 Microsoft Office servers Service Pack 2, 64-bit edition | 953334
The 2007 Microsoft Office servers Language Pack Service Pack 2 | 953334
The 2007 Microsoft Office servers Language Pack Service Pack 2, 64-bit edition | 953334
Windows SharePoint Services 3.0 products
Windows SharePoint Services 3.0 Service Pack 2 | 953338
Windows SharePoint Services 3.0 Service Pack 2, 64-bit edition | 953338
Windows SharePoint Services 3.0 Language Pack Service Pack 2 | 953338
Windows SharePoint Services 3.0 Language Pack Service Pack 2, 64-bit edition | 953338
I saw Bob Sutor's post last week titled "There is humor in the OOXML morass." This is where he calls out comments from Nick Tsilas as worthy of "a good laugh." I interpreted Bob's post as an attempt by IBM to deny that they are leading the charge against Open XML.
I like a good laugh too, so I had a look. I looked through the history of Rob Weir's blog, just to see what the chair of the OASIS Technical Committee is doing to drive the agenda for ODF, and to see if there was any sense of balance between the IBM Open XML agenda and the IBM ODF agenda. The only thing "funny" to me here is the revelation of the data behind Nick's claim. So this blog is again an attempt to sort out what IBM is really saying / doing. Confusion reigns supreme.
I took somewhat of an informal look at Rob's blogging history, and it really does illustrate IBM's tactics on Open XML. I didn't use his tagging, I just read them and offered my own thoughts. It is quite clear that at least part of IBM is campaigning against Open XML. I'm not sure why Bob wants to hide from that.
By my count, Rob Weir has 134 blog posts in his archive.
- 94 of those posts have a central anti-Open XML and/or anti-Microsoft theme.
- 24 of the posts are about ODF and related technology, momentum, OASIS news, etc.
- Many of the Open XML posts are anti-Open XML posts with significant ODF discussion.
- Others have a focus that is dedicated to PDF, gardening or other things.
Even in the best possible light, we have a 3:1 slant toward opposing Open XML instead of touting ODF. This anti-Open XML stream originates from the co-chair of the OASIS ODF Technical Committee? Seems like Rob's attention is somewhat diverted. Having read most of the posts, some contradictions are apparent.
Rob Weir on one side: "Q: So, does IBM then oppose CDF in favor of ODF? (18 Nov 2007) A: No. IBM supports both the development of ODF and CDF and has a leadership role in both working groups. These are two good standards for two different things."
Rob Weir on the other side: "That is the distortion you get if you look at a standards war through the narrow blinders of commercial interest. But if you look at the full market impact, the simple economics of it, it becomes a lot clearer. What brings greater efficiency, greater fidelity, greater innovation and lower costs? Having two incompatible document format standards? Or having a single harmonized document format standard? Fighting against economics is like fighting against gravity or the 2nd Law of Thermodynamics. You are going to lose in the end. The piemen of Erie, and their modern counterparts, are on the wrong side of economics, and history"
Rob Weir on one side: "Those who control the exchange format, can control interoperability and turn it on or off like a water faucet to meet their business objectives."
Rob Weir on the other side: "I personally, as Co-Chair of the OASIS ODF TC, stand ready and willing to sponsor such a harmonization effort in OASIS. So let's start harmonization now, and avoid further divergence."
Rob Weir on one side: "Again, in order to support OOXML fully, and provide support for all those legacy documents, we need to divine the behavior of exactly how Word 6.x "inappropriately" placed footnotes. The "Standard" is no help in telling us how to do this. In fact it recommends that we don't even try. However, Microsoft continues to claim that the benefit of OOXML and the reason why it deserves ISO approval is that it is the only format that is 100% backwards compatible with the billions of legacy documents. But how can this be true if the specification merely enumerates compatibility attributes like this without defining them? Does the specification really specify what it claims to specify?"
Rob Weir on the other side: "…There are now and will continue to be multiple implementations of ODF and it is legitimate that they have application-defined features. These are stored as name/value pairs in a separate XML file in the ODF archive. I can think of no argument against that. Obviously no interoperability is expected for these vendor specific features, which are for things like application settings like window sizes, zoom factors, print settings, etc. In any case, ODF merely provides a place for applications to store these settings. To blame ODF for any vendor misuse of this feature is like blaming the W3C and HTML for non-standard extensions in Internet Explorer."
I have more to do than criticize Rob Weir, so I'll have to stop here. Between the data and the examples, though, the problem is clear. I really view this IBM-centric position as self-defense against reality. It seems like if the chair of the ODF TC wanted to improve adoption of ODF, then the blog would reflect that.
I wonder if the paranoia here is because of the contrast represented in the uptake of the Compatibility Pack vs. the ODF Translator for Word? The current count is 20 Million for Open XML vs. under 237,000 for the ODF Translator… or perhaps this list: http://www.openxmlcommunity.org/applications.aspx, which is A LOT longer than this one: http://opendocument.xml.org/products. (It's also worth noting that nearly every product mentioned on the OpenDocument site also supports Open XML.)
Facts are facts here. If you strip away the adjectives and adverbs, we can have a discussion about reality, based on data. Step 1, however, is acknowledging that out-sound-biting each other won't get us to a situation where we can achieve interoperability with document formats. Effort and cooperation are required. Productive discussion should come to the fore. It's hard to sustain a relevant dialog with the industry when the signals from its constituents are mixed up like this.
I've included my swag at the Rob Weir posts.
Rob Weir Post | Anti OOXML | ODF focus | Anti ISO | Anti PDF | Other
The Case for Harmonization
X
What every engineer knows
Comedy tonight!
The Standards Trolls
You are Here
The Piemen of Erie
Legacy format FUD
Those who forget Santayana...
A Lick Back in Time (stamps)
The Right and Lawful Rood
Bait and Switch
662 resolutions, but only if you can find them
The Myth Of OOXML Adoption
PDF, The Waste Land, and Monica's Blue Dress
Document Format FUD: A Guide for the Perplexed
ODF enters the Semantic Web
Cracks in the Foundation
The biggest media launch of all time?
OpenOffice.org Conference 2007
Office 2007's Confusion Mode
How to Hack ISO
Pseudorandom Thoughts
The OOXML BRM
Disenfranchisement
Defective by Design
Is it safe?
The dog that didn't bark
The to the power of hype
The most recognized tune of all time
Two Feet, No Feathers
An Invitation: ODF Interoperability Workshop
One Year and One Hundred Posts Later...
My comments on the ETRM 4.0 draft
Competition Optional
Stranger than Fiction
The Cookbook
OOXML Fails to Gain Approval in US
The Formula for Failure
A File Format Timeline
The Value of Choice
No Representation Without Specification
Hemidemisemiquavers
Documents for the Long Term
The Legend of the Rat Farmer
Interoperability by Design
The Funnel and the Wedge
So where are all the OOXML documents?
Math markup marked down
Sometimes I need to remind myself
The Case for a Single Document Format: Part III
The first harvest of the season
The ODF Validation Service
The Case for a Single Document Format: Part II
Cannibalism
Pruning Raspberries
ODF Freely Available
The Case for a Single Document Format: Part I
Fast Track. Wrong Direction.
Document Migrations
Compatibility According to Humpty Dumpty
OASIS Symposium and OpenDocument Workshop
Essential and Accidental in Standards
Standards and Enablement
The Anatomy of Interoperability
Washing Machines are not Lamps
The Word Ends on May 1st, 2010
How Standards Bring Consumers Choice
Once More unto the Breach
Here today, gone tomorrow
Merely a flesh wound?
A Barleywine
Declaring Bankruptcy
Introducing ODF 1.1
More Matter with Less Art
Defining Deviancy Down
Microsoft on Standards
Adobe to Standardize PDF
A Review of the Wikipedia Article on ODF
Crocodile Tears
Linus's Law Applied to Standards Review
Document Format Punditry
The Parable of the Solipsistic Standard
Opportunity Knocks
Amusing but Confusing
The Vast Blue-Wing Conspiracy
A Foolish Inconsistency
Calling Captain Kirk
Guillaume Portes Redux
Surviving the Slashdot Effect
The Formats of Excel 2007
Broken Windows and the Ghost of Keynes
How to hire Guillaume Portes
A Brief History of Open
And then there were three...
Got ODF?
A notable achievement
How to Write a Standard (If you Must)
The worm in the apple
Some short notes
Beware of Geeks Bearing Gifts
Happy Thanksgiving
Genesis 11:5-9
Two simple questions
Unlocking the Wordhord
Ass-backwards Compatibility
The Chernobyl Design Pattern
Why is OOXML Slow?
The Celerity of Verbosity
A bit about the bit with the bits
When language goes on holiday
A Leap Back
Lingua franca, lingua exposita
In Dublin's Fair City
Nothing is certain but death and ...
ODF: Twenty Patterns of Use
Proposal for an Open Document Developers Kit (ODDK)
Fruits of the Season
Lyon Summary
The OOXML Compatibility Pack
A quick look at the 0.2 ODF Add-in for Word
Happy Labor Day
A Tale of Two Formats
The 96.97 percent problem
Four Shorts
A Demo: Mathematica, MathML and ODF
Math You Can't Use
Follow the Leader
Throwing stones at people in glass houses
Add-in finitum
Cum mortuis in lingua mortua
Site Updates
A game of Zendo
Lost in Translation
Traduttore, Traditore
Today is a very important day in the history of Microsoft; it's a moment with which I am very happy to be associated. If you didn't see the big news, you can read about it here. But if you just want to skip to the "What does this mean for Office and File Formats?" part, read on.
To recap what was announced, Microsoft is making changes to its technology and business practices in the area of interoperability. These advancements are designed to make our products more open and more available to the broader software community. There are four central principles involved:
This is consistent with a path we've been on with Office for quite some time. This is also reflected by many actions we've taken over the past 5 years. By "change," one can really argue that we're moving out of the slow lane into the fast lane on this topic, but we're still on the same road traveling in the same direction. A great example of the steps we've been taking is our announcement last week of broad, public availability of the Binary file format documentation.
To recap some of the ground we've already covered with respect to interoperability, here's a short list of background reading that is helpful: Interoperability Executive Customer (IEC) Council, Interoperability Vendor Alliance (IVA), CodePlex, Interoperability Home, OpenXMLDeveloper.org, OpenXMLCommunity.org, Accessibility, Interop Agreements with Novell, Xandros, Linspire, Turbolinux, IdeaAlliance & XML 2007 Conference, Virtualization, OpenID, JVC, Health Care, ADO.NET, the OSP, Microsoft Public License (Ms-PL), Microsoft Reciprocal License (Ms-RL) and a host of others (hopefully the point is clear).
There's a specific aspect of this announcement that I wanted to highlight, because it is very relevant to the ODF discussion (as well as other file formats.) You may have seen this text in connection with the announcement:
"Enhancing Office 2007 to provide greater flexibility of document formats. To promote user choice among document formats, Microsoft will design new APIs for the Word, Excel and PowerPoint applications in Office 2007 to enable developers to plug in additional document formats and to enable users to set these formats as their default for saving documents."
For Office 2007, we will design and make available interfaces that can enable different file formats to be set as a default format in Word, Excel and PowerPoint. This addresses a key concern of organizations like BECTA who desire to have ODF or UOF or other formats enabled as a "default" for Microsoft Office applications. Should other file formats emerge in the future that are suited to word processing, spreadsheets or presentations, these new interfaces will enable their use with Office as well. In the past few years, we've seen a lot of momentum around document format support, and today's announcement will only improve the ability for others to integrate with our products.
It's important to note that the interoperability principles are unrelated to the current ISO standardization process for Open XML, and that process will proceed without regard to what is being announced today. But the objectives of Open XML now have an increased focus and sharpness. The purpose of the Ecma standard (and proposed ISO standard) is to represent content that exists in billions of binary documents, as well as delivering the type of business process integration enabled by the use of custom-defined schema. Open XML is unique in this regard.
We've said this before, but the goals of Open XML are distinctly different than ODF, PDF or UOF, and hopefully we can begin to separate the conversation about product functionality from the necessity for the Open XML standard. In our view, these have always been different conversations. The addition of these interfaces removes a potential obstacle to the adoption of other standards within our products.
The change should also underscore the idea that support for Open XML is not the same as opposition to other standards, despite many claims to this effect. Different formats are a means to achieving specific types of work, and interested communities exist to offer support for them. Today's commitment creates new opportunities to use many document formats in Office, and will allow people a greater ability to choose the formats that best suit their specific needs. This is good for our customers, but it's good for our business, too; adding these interfaces makes a lot of sense.
Microsoft has been and continues to be fully committed to opening its document formats for Word, Excel and PowerPoint. Interoperability is not new to Office, and Open XML is part of a much broader strategy around interoperability for Office. When you look at the past three years of document format related investments, you'll see this shining through; we've done quite a lot. Different circumstances led to each of these activities, but as a collection of work, the intent is unmistakable, and despite claims to the contrary, we're highly motivated to ensure that we can participate in an open environment. These are all steps toward openness, which is good for us, good for our customers and good for the industry.
Brian Jones has covered the history of the formats and XML support for Office in a prior post; there is a significant amount of ground covered there.
Let's take a look at what has happened:
So, no matter how you look at it, or which nits you'd want to pick (correct or not), this is a long list of advancements that really illustrate how much we have moved forward on openness and interoperability with respect to document formats in the Office products. Hopefully I don't have to write a grand conclusion; the data here should speak for itself. Let's just say that the commitment to openness here is evident and unquestionable.
I spend a lot of time working on the adoption of the Open XML Formats,
For IT organizations, it can be a daunting task to migrate document formats in Office, and the benefits are not always immediately obvious. Microsoft spent a fair bit of time on tools / guidance to make the introduction of Open XML easier, and I'll drive deep on those in future posts. But I wanted to use this opportunity to discuss one of the primary reasons why you should let Open XML in, and how it can help. This will be the first in a three-part series on file size reduction, document "sanitization" and improvements in document format security.
A tangible benefit of Open XML is file size reduction. Reducing file sizes means lower storage costs and reduced bandwidth consumption. Particularly for those paying for bandwidth on a meter, this can be quite helpful.
Why are Open XML files smaller? With Open XML, and the Open Packaging Conventions, the file architecture is much more modular and is compressed using a ZIP archive. Storing XML content in a ZIP container lends itself very well to compression, so we do see great results for text-intensive files like word processing documents and spreadsheets. The benefits don’t translate as well for presentation files, because those tend to be image-intensive (and therefore do not benefit as much from ZIP compression), but even those are smaller.
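To get a feel for why XML in a ZIP container compresses so well, here is a minimal Python sketch. It does not read a real Office file; the helper name `zip_sizes` and the repeated paragraph markup are made up for illustration, and the part name is used only as a label. It simply stores a block of repetitive XML-like text in an in-memory ZIP archive and reports the size before and after compression.

```python
import io
import zipfile


def zip_sizes(name, payload):
    """Store payload in an in-memory ZIP using DEFLATE.

    Returns (uncompressed_bytes, compressed_bytes) for that entry.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, payload)
    # Re-open the archive for reading and look up the entry's recorded sizes.
    with zipfile.ZipFile(buf) as zf:
        info = zf.getinfo(name)
        return info.file_size, info.compress_size


# Hypothetical stand-in for a text-heavy document part: the same
# XML-like paragraph repeated many times (an assumption, not real data).
paragraph = "<w:p><w:r><w:t>The quick brown fox jumps over the lazy dog.</w:t></w:r></w:p>"
xml_part = (paragraph * 500).encode("utf-8")

raw, packed = zip_sizes("word/document.xml", xml_part)
print(f"uncompressed: {raw} bytes, compressed: {packed} bytes")
```

Real documents are less repetitive than this toy payload, so the ratio you see here will be more dramatic than anything you should expect from your own files. Already-compressed content such as JPEG images gains very little, which is consistent with the smaller benefit for presentations.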
The data in this post is a preview of a more comprehensive study we’re working on, but I thought I’d share some of the early returns. There’s no real magic in the study, it’s a pretty simple project. If you want to try this for yourself, you can do what we’re doing: use your favorite search engine / content store to retrieve 100 documents each for word processing (Word 97-2003), spreadsheet (Excel 97-2003) and presentation (PowerPoint 97-2003) format documents, and convert them to Open XML. Results will always vary slightly depending on your data set, but the results should be somewhat consistent with what we’re showing here.
You can do the document conversion using the desktop products, or the Office Migration Planning Manager (and the Office File Conversion tool, specifically), which has a command line interface. Other conversion tools are also available. Quality / results will vary depending on the translation environment.
This post will only discuss the Word documents converted using Word 2007, but the data will illustrate the survey results clearly.

              | "docx" Sizes | "doc" Sizes | Size Change | Storage Gain
Median        | 30Kb         | 69Kb        | 29Kb        | 52%
Minimum       | 11Kb         | 20Kb        | -2Kb        | -2%
Maximum       | 559Kb        | 975Kb       | 784Kb       | 87%
Percentiles   |              |             |             |
  25          | 18Kb         | 35Kb        | 15Kb        | 40%
  50          |              |             |             |
  75          | 76Kb         | 160Kb       | 67Kb        | 62%
A median size reduction of 52% for documents is quite significant, and translates to real savings for disks and network traffic. We can assume a linear correlation between document size and the number of packets transmitted over a network; therefore we can assume a similar result in bandwidth consumption (bandwidth consumption data will be published in the final paper as well.)
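If you run the same exercise on your own documents, the summary figures are easy to compute. The sketch below is one way to do it in Python; it assumes that "storage gain" means the fraction of the original .doc size that was saved, which is my reading of the table rather than a documented definition. The `pairs` list is hypothetical sample data, not the survey data set.

```python
from statistics import median, quantiles


def storage_gain(doc_kb, docx_kb):
    """Fraction of the original .doc size saved by the .docx version.

    Negative when the converted file is larger than the original.
    """
    return (doc_kb - docx_kb) / doc_kb


# Hypothetical (doc size, docx size) pairs in KB, for illustration only.
pairs = [(20, 11), (35, 18), (69, 30), (160, 76), (48, 49)]

gains = [storage_gain(doc, docx) for doc, docx in pairs]

summary = {
    "median": median(gains),
    "minimum": min(gains),
    "maximum": max(gains),
    # Three cut points: the 25th, 50th and 75th percentiles.
    "quartiles": quantiles(gains, n=4),
}

for label, value in summary.items():
    print(label, value)
```

Note that each figure is computed independently across the whole set, so the median gain does not have to match the gain you would calculate from the median file sizes.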
Create a simple document in Word 2007. A great way to generate sample text in Word is by using a formula: “=rand(10,5)”, where 10 is the number of paragraphs in your document, and 5 is the number of sentences per paragraph. You can use this formula to generate documents of increasing length. In doing so, the benefit of compression in Open XML becomes instantly clear. I conducted this test 5 times, on documents ranging from 10 paragraphs of text to over 60 pages. (I have attached them here for you to use.)
I simply added the text, saved the file in binary format first, then saved the file again as Open XML. There is no formatting beyond my default template: no tables, images or anything other than simple paragraphs. As the documents increase in length, the benefit of compression is obvious:

Sample file name | .doc size | .docx size
Test 1           | 31k       | 11k
Test 2           | 86k       | 13k
Test 3           | 147k      | 15k
Test 4           | 269k      | 18k
Test 5           | 513k      | 26k
If you’re a graph type, we can make the relationship more clear:
This isn’t to say that 5,000 page documents stored using Open XML are going to be 1 – 2 % of their original size, but this is to point out that it is very easy to demonstrate real space savings with Open XML. Depending on the nature of the documents you are creating, especially if they are text-intensive, the size difference can be quite dramatic.
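For those who prefer numbers to graphs, here is a small Python sketch that works out how large each .docx file is relative to its .doc original, using the five test sizes reported above. The function and variable names are my own, chosen for illustration.

```python
# (doc size, docx size) in KB for the five test files reported above.
tests = {
    "Test 1": (31, 11),
    "Test 2": (86, 13),
    "Test 3": (147, 15),
    "Test 4": (269, 18),
    "Test 5": (513, 26),
}


def docx_share(doc_k, docx_k):
    """Size of the .docx file as a percentage of the .doc file."""
    return 100 * docx_k / doc_k


shares = {name: docx_share(doc_k, docx_k) for name, (doc_k, docx_k) in tests.items()}

for name, share in shares.items():
    print(f"{name}: the .docx is {share:.1f}% of the .doc size")
```

The share falls from roughly a third of the original size for the shortest document to roughly a twentieth for the longest. That is what you would expect when a container has a fixed overhead and the text inside it compresses well.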
We’ll eventually publish the full data set in a more detailed (and scientific) white paper, and the paper will publish in late January. But as an introductory post, I thought I’d make this an easy one, with a pretty clear benefit. I’ll let you work out the math for your own storage & bandwidth savings, but if you can ask yourself “what would I gain if my files were half of their current size?” – I’ll bet the answer will usually be a good one.
Want to get an in-depth look at Office 2010 for Developers? Want to see what 64-bit Office looks like?
As you may have seen at PDC, TechEd or elsewhere, Office 2010 is on its way. To help you get ready, Office 2010 for Developers will be highlighted at the upcoming SharePoint Conference (October 2009, Las Vegas, NV) and TechEd conferences around the world in 2009 and 2010.
Note: The Office Developer Conference will not take place this year; instead we are including the Office Developer Conference content within the SharePoint Conference. If you attended the Office Developer Conference in the past, we strongly recommend you come see us at the SharePoint Conference in October, where we’ll cover Office client development in depth. Be sure to sign up for the Technical Preview as well!
We are optimizing our show presence for developers seeking opportunities to build on the Office platform, which includes Office client applications, SharePoint, Exchange and Communicator. By adding the ODC track to the 2009 SharePoint conference, we can provide better exposure to those seeking to develop solutions across the platform.
For more information on the SharePoint Conference contact spc@microsoft.com, and for the PASS Summit Unite conference, please contact marcella.mckeown@sqlpass.org.
I was pointed at a document created by the ODF alliance (with a creation date of Feb 6th) that discusses the recent dust-up on supported file types in Office 2003 Service Pack 3. While this was not one of our shining moments as a product, I was disappointed to see the amount of gross factual error in the report.
I'd like to spend some time illustrating how the report is wrong, and how the ODF Alliance ignored available data in preparing it. By doing so, of course, I hope to illustrate the level of credibility with which such claims should be regarded. All the documents / information I reference were available long before the publishing date of this material.
A Knowledge Base article discusses the actual changes made: http://support.microsoft.com/kb/938810
ODF Alliance Claim
The Truth
“When a user attempts to open one of these older files, they will receive the above dialog box and no alternative actions are given to help users to get access to their information in these “blocked” files.
“When pressed for answers regarding this change, Microsoft eventually admitted that their action was in response to concerns with their parsing of Office 2003 code that presented a risk, but only after they suggested the move was in response to security concerns with the files themselves. Microsoft continues, in our view, to erroneously maintain that files in these formats are creating a “security risk.”
http://blogs.msdn.com/david_leblanc/archive/2008/01/04/office-sp3-and-file-formats.aspx
David LeBlanc, a senior developer in the Microsoft TWC group, explains the Microsoft position on the issue, which is that the parsing code for the formats is the general problem that requires resolution. This is consistent with other activities taken by Microsoft on other document format related vulnerabilities: http://support.microsoft.com/kb/935865
While it might be a neat trick for the ODF Alliance to caveat inaccuracy with “in our view,” nobody at Microsoft is claiming that the formats are the problem. If such a claim is made, it is against the guidance of the product support documented here http://support.microsoft.com/kb/938810 and here http://blogs.msdn.com/david_leblanc/archive/2008/01/04/office-sp3-and-file-formats.aspx
For what it’s worth, nobody is discussing “Parsing of Office 2003 code”, what is being discussed is “document format parsing code in Office 2003.” It’s not clear if the author of the document was in a hurry when this was written, or if they just lack a basic understanding of the problem.
“Really, what is at risk is Microsoft's ability to sell more products, namely their new Office 2007 which will lock users into their new file format, Office Open XML (OOXML), which despite its name, is not open. What is at risk is Microsoft's own coding errors.”
The logic here is hard to follow: We have disabled legacy formats of Corel and Lotus. Because of this, you are unable to download the ODF Translator to Save as ODF, and you are unable to use the array of other formats that are supported (that include Open XML, HTML, TXT, RTF, CSV, and a lot of others.)? Really? Or are we adding a FREE update to a product people ALREADY have to give them a way OUT of the legacy formats?
Even better, by disabling file formats created by our legacy products the ODF Alliance accuses Microsoft of locking people into our new products by encouraging people to use file formats that are more open than the previous formats. That list of formats includes Open XML, ODF AND UOF. In fact, the reality is exactly the opposite of what the ODF Alliance suggests.
"Unless you need to work with these very old file types on a regular basis, it's probably not a good idea to keep these file types unblocked for long periods of time." The spokesperson, Microsoft Office product manager Reed Shaffner, fails to mention what should be done if one does need ongoing access to older documents.”
The “Save As” feature works well as a remedy here. After unblocking the file types, one can use the current file formats of the Microsoft Office products, or ODF/UOF if one has installed the translator (or TXT, HTML or a number of others). For the Lotus and Corel formats involved here, one could always use the Lotus and Corel applications that created them.
Again, it’s not clear how Lotus and Corel formats are a means for Microsoft to perpetuate the alleged “lock-in.” This would be perfectly congruent to a claim that says “OpenOffice locks you in because it has the ability to open and save Microsoft Office binary formats.” – the argument is equally absurd in both cases.
“Just like Office 2003's lack of older file format support problems, OOXML compatibility with proprietary components like Windows Meta File (wmf) and Vector Markup Language (vml) have been deprecated because of “security concerns” that might prevent ISO approval next month.”
This is also inaccurate. VML was deprecated for the reasons described here: http://blogs.msdn.com/brian_jones/archive/2007/12/21/500-national-body-comments-posting-today.aspx and that has nothing to do with security. WMF was removed at the request of various national bodies who desired to have references to “Windows Metafiles” removed – to make the format more platform-neutral.
I’ve spoken on security as well, and I am waiting to see the ODF Alliance recommendation for the removal of binary content from subsequent versions of ODF: http://blogs.technet.com/gray_knowlton/archive/2008/01/18/noooxml-says-down-with-xml.aspx
“Exchange 2000 (in 2003) supported the WebDAV; today it does not. To use WebDAV now requires the additional purchase of Microsoft Office SharePoint Server.”
This claim is odd (and false), because Exchange is a mail server and SharePoint Server is about portal management. SharePoint is not a mail server, and Exchange is not portal management software. This is a nonsense argument.
The more important part of this discussion, however, is that Exchange 2007 does support WebDAV: https://blogs.technet.com/sayleong/archive/2007/03/25/features-de-emphasized-and-discontinued-in-exchange-server-2007.aspx, which gives WebDAV life in Exchange Server (including product support) until 2016, per the Microsoft support policy. Exchange has deprecated the functionality in favor of other technology, but this is different from “not supporting” WebDAV.
“Office 2003 supported Reference Schemas that Microsoft offered to make public to governments requesting openness in file formats. Today, Office 2007 has obsoleted these schemas. Further, it is unknown what file formats Office 2009 - a product Microsoft is currently developing - will support”
Wrong again. The XML Reference schemas of Office 2003 are supported in Office 2007. And, while the attempt at FUD relating to the future of Microsoft Office is interesting, let’s wait and see which Microsoft communication the ODF Alliance will cite as the basis for the claim.
“Recent events with Office 2003, and the examples above, should act as a cautionary tale of proprietary product, vendor, and platform lock-in. Although individual users are likely to experience frustration with compatibility issues seen in Office 2003 and OOXML, businesses and government agencies face a much more serious set of problems due to the ever increasing demands for document retention. Imagine the expense that a government agency using OOXML will incur for the initial conversion of documents and subsequent conversions that would be required whenever Microsoft is inclined to change their "standard".”
From this I guess we should assume the ODF Alliance is saying two things:
ISO26300 will be the only version of ODF that ever exists, and the cost of migrating documents to open formats isn’t worth the effort.
We disagree on both fronts: ODF already has three versions, and PDF has many (1.0-1.7, plus PDF/X, PDF/A and others). The whole point of having open file formats is to encourage a migration to interoperable solutions. It’s not clear why the ODF Alliance argues against this.
“The OpenDocument Format (ODF), an ISO standard, employs this type of design. Additionally, as ODF is supported by several vendors, platforms, and implementations, no user of ODF documents is at the mercy of any particular vendor.”
Last I checked, this list: http://www.openxmlcommunity.org/applications.aspx is a lot longer than this list: http://opendocument.xml.org/products.
And as we just saw in the prior argument, I guess we should take this to say that ODF 1.0 should be used, and ODF 1.1 and 1.2 should never be considered, given the (claimed) difficulty of migrating to an evolving standard.
Interesting post here: http://www.oooninja.com/2008/03/openofficeorg-30-new-features.html
"Microsoft Office 2007 file format support
Microsoft Office 2007 (also called Office Open XML) file formats include .docx, .pptx, and .xlsx. Despite the similarity in names, these formats are significantly different than the Microsoft Office formats used since 1997. OpenOffice.org 3 will offer native read and write support.
OpenOffice.org 3.0 DEV300_m3 converted this reference .docx document with mediocre quality. The notable problems were tracked changes, a comment, columns, an image, and an embedded Excel document. For comparison, the same document is shown rendered in Word 2007 and in OpenOffice.org 3.0 DEV300_m3."
The post references this link as well: http://katana.oooninja.com/w/odf-converter-integrator
Once again, those interested in interoperability benefit from adoption of the Open XML formats. I'll take this as a very strong statement of support for the Open XML IP Policy. I'm just wondering, is this what IBM is contributing to OpenOffice.org? Is this why they joined? :)
Onward!
Brian is in a unique position of being a TC-45 member and Microsoft employee, and his post really illustrates how much work has to go into the Ecma efforts around ISO standardization. I've been in the trenches with Brian on file formats for quite a while now; I've seen how hard the work is. I am always happy to applaud my fellow Buckeye fans, especially when they work so hard to carry the ball forward on Open XML and interoperability.
If you didn't see Brian's post, it's worth a read before you proceed here. In essence, it says that Microsoft will adjust the existing binary format program so that the documentation will be available directly from the Web, and offered under the Open Specification Promise. It goes on to say that Microsoft has committed to sponsoring a binary to Open XML conversion tool as an Open Source project. These developments are a response to national body comments on Open XML in the ISO/IEC standardization process.
It's important to recognize that binary format documents are important digital assets. This conversion project is important because it effectively makes the conversion of documents in binary formats to Open XML even simpler by providing a reference implementation that can be reused. It also provides more options for people to transition from binary to Open XML formats, with or without Office.
In addition, the OSS project will make it even easier for an array of products that currently support the binaries to transition to a more developer-friendly XML format. If you believe the OSS model, you'll agree that offering the source code for converting binary documents to Open XML documents will hopefully stimulate a community of software products that will perform this valuable service. I think of the scores of content management software providers who implement the current binary formats, who are faced with a question of what to do about file formats… happy about Open XML because they get an easier file format to develop against, but questioning what the best way is to go about beginning a transition. Having a reference implementation will provide an easier starting place in the transition process.
Adobe, Sun and IBM already have received our binary file format specifications for Word, Excel and PowerPoint. Each of these companies currently ships products which support .DOC, .XLS, or .PPT. (For example, Adobe Acrobat includes a "Save as .DOC" feature; StarOffice, Notes, and other applications support the existing binaries.) The OSS project should provide them with an additional mechanism to understand how to convert binary to Open XML (and subsequently translate between XML formats if they choose), but also handle the binary formats in other applications. Many, many other companies have also licensed the formats.
In the end, these announcements are really a "rising tide" for everyone interested in file formats. It benefits our partners, standards participants, competitors, and hopefully answers a lot of national body comments as well.
It is interesting to witness the vocal minority who insist Open XML and ODF become the same format. It must cause them terrible heartburn to know that their recommendation comes against the wishes of the ODF Editor. And yes, Rob, this is Mr. Durusau speaking as "the editor of OpenDocument." So much for fine distinctions.
http://www.durusau.net/publications/wholoses.pdf
Mr. Patrick Durusau is again making his position on the issue clear:
"As the editor of OpenDocument, I want to promote OpenDocument, extol its features, urge the widest use of it as possible, none of which is accomplished by the anti-OpenXML position in ISO. Passage of OpenXML in ISO is going to benefit OpenDocument as much as anyone else. Here are some specifics:
OpenDocument currently lacks formula definitions for spreadsheets. (To appear in OpenDocument 1.2.) Many core financial functions in spreadsheets are undefined except for actual Excel output. That output varies by version and service pack of MS Office. What happens if OpenDocument and OpenXML reach different definitions of those functions?
OpenDocument does not presently support legacy features of Microsoft formats. That will be easier with a formal definition of those features. Without OpenXML, OpenDocument has no authoritative definition of those legacy features. That delays OpenDocument supporting them in some future release.
OpenDocument does not have a robust mapping to the current Microsoft format. That requires an OpenXML that has completed the standards process. If OpenXML is unclear, it must be fixed in order to create a robust mapping between the two.
The bottom line is that OpenDocument, among others, will lose if OpenXML loses.
Covington, 24 March 2008
Patrick Durusau"
It's a small thing, but fun. Today I was given my 5-year service award for Microsoft. With a brief ceremony and a few jokes about how much more gray hair I have these days, I am now the proud owner of a new conversation piece. Just like Star Trek, I have a crystal that is seemingly capable of recording my memories and replaying them at will.
Unfortunately I don't have the glamorous history that so many people at Microsoft have. I do not have an expensive MBA, nor was I a catch in the college recruiting net. I do not wear the badge of having managed a failed startup / VC-backed thing. I was / am an "industry hire" with a surprisingly boring history of working on very successful products and services. I worked on Adobe Acrobat and PDF for a long period of time. I worked for a company named System Integrators, Inc., once a leader in the monolithic newspaper publishing automation space. I was a System Engineer there, working on Tandem hardware, installing systems, doing Y2K conversions, and writing routines in languages like TACL, RGEN, FGEN, FUP and a few other favorites.
I chose to work for Microsoft expecting to find great people building great products. I expected a highly competitive, smart work environment. I expected that the talent level at Microsoft would be the highest that I would have ever seen. In joining the Office team and one of the largest and most significant franchises in the brief history of software, I expected to find a caliber of leadership that exists in very few places.
I must say that my expectations have been exceeded in almost every instance. The IQ of each individual in this company is amazing. The per-capita talent level at Microsoft is something that one must witness in person to truly understand. It can work against you on occasion, when you have too many smart people asking hard questions. But on the whole, I will continue to bet on Microsoft long after I am gone because of the discipline the company has for finding, selecting, cultivating and utilizing talent.
Things I have worked on / with or Titles I have held:
- Sr. Product Manager, Microsoft Word
- Sr. Product Manager, Microsoft InfoPath
- Sr. Product Manager, Open XML
- Group Product Manager, Office Technical Product Management
- Group Product Manager, Office for IT Professionals
- Group Product Manager, Office for Developers
Most of my role at Microsoft (despite the numerous titles) really revolves around doing more with data, and improving portability of information between our applications and other systems. Data portability is a concept that I learned and practiced building newspaper production automation systems (that were invented in the mid-1980s). For all the hubbub about XML, SOA, Services-based computing, etc., this central idea really has not evolved much. Certainly the technology has changed a lot, but the end goal is still the same - enabling information exchange between systems and applications to enable better information exchange between people. (There's a famous saying at Microsoft about "Solving any problem through abstraction" - sorry for "abstracting" a bit here. In truth the problem is complex, nuanced and fiercely competitive among software vendors.)
I'll end the ramble here with a thought - for the readers from other software companies, a bit of a challenge to you.
Microsoft is great as an employer for many reasons, but one in particular that is worth highlighting is the competitive fire and spirit that burns in the core of so many individuals here.
I once worked for an employer who regarded competitors in a very unhealthy way. I won't quote some of the statements that were particularly offensive, to avoid internet searches that might reveal the identity of those people. But I'll just summarize it by saying that when issues / publicity about competitive products surfaced, the general mentality set by leadership (and therefore employees) was "how dare you?" As if they were viewing other companies who dared build similar products as a personal affront. Part of my reason for leaving the job was this mentality. Competitors were innovating, and the company was responding by sending MBA's into business strategy reviews to "Fix" it.
Microsoft is quite different. A very important moment from early in my Microsoft career was when I watched Steve Ballmer on stage at a meeting talking about competition. I'll spare the context, but I have a very vivid impression of him standing there, rolling up the sleeves of his powder blue button-down shirt, saying "Bring it on." - Never was there a more concise encapsulation of the mentality of this company. It isn't about MBA's in a strat review for us - it is about engineers writing code and innovating. We focus on the product, and we are strongly committed to our customers and their success.
It is for this reason that I stay at Microsoft. We're not always perfect; there is definitely a "bleeding edge" to bringing new products and technologies to market. As we stand on the front end of the Office 2010 launch, though, one can't help but have a very good feeling. When we get it right, the impact on how people interact with computers is profound and long-lasting.
Here's to hoping the next 5 years are as gratifying as the last.
Hello, this is Michael Kiselman again with exciting news related to the Office 2010 compatibility program and tools.
We have many participants using and testing the Application Compatibility tools for Office 2010. Thus far we have received much feedback from the community and are making improvements to the tools as a result of that feedback. As it goes with many pre-release programs, particularly ones which have no real precedent, the questions and feedback we are receiving are generating more questions and the need for even more feedback. As we near the launch of Office 2010, we want to double-down on our beta program for the tools and content. We're going to put some skin in the game to increase the rate of feedback we're getting on the materials.
Among the areas where we are seeking feedback are bugs in the OEAT and Code Compatibility Inspector Tools. We are going to conduct a public bug hunt, and we are going to offer prizes for those who can help us find defects in our tools.
For our bug hunters we are offering 2 prizes – XBOX360 Elite and 8GB Zune.
To win the prizes, all you need to do is download the beta of the Office Environment Assessment Tool (OEAT) and the beta of the Office 2010 Code Compatibility Inspector, run them and report any bugs you discover to ofappcpt@microsoft.com before April 9th. If we are able to reproduce your bug or you can help us identify a defect in the tools, that becomes your sweepstakes entry. From the submitted and approved bugs, we will randomly select winners after April 9th.
We are looking forward to your feedback! Happy Hunting!
The Official Rules for the Sweepstakes are located here.
You should read these posts if you care to understand the technical facts of the matter:
- http://blogs.msdn.com/dmahugh/archive/2009/05/09/1-2-1.aspx
- http://blogs.msdn.com/dmahugh/archive/2009/05/05/odf-spreadsheet-interoperability.aspx
I’d like folks to see some of the commentary on the web by people who have been close to the discussion. While I may be on the end of a continuum with respect to my opinions of the conduct of the ODF TC chair (please notice I am speaking of the conduct and not the person), I am certainly not alone. There is A LOT of smoke here, some of it from ODF TC members past and days gone by.
ODF Editor, Patrick Durusau: “Every keystroke for a negative message about some other standard, corporation or process is a keystroke taken away from promotion of OpenDocument. If enough such keystrokes fall, so will OpenDocument. It's your choice.”
Rick Jelliffe: “A committee chairman has to be a mediator. That is their most basic function, along with organizer and promoter. A contentious and proudly partisan person is simply not suitable as a mediator, nor can someone who is paid to be a provocateur simply pretend they can be an effective mediator.
Also, standards committees usually feel it incumbent on themselves to have commercially neutral chairs. This is why academics and government people usually are appointed to these positions. The more that someone is involved commercially in the fray, the less appropriate and congenial it is for them to exercise authority in committees.”
Guy Creese: “I recommend that you read both blog posts, in that they highlight the complexities of coding to an ever-evolving open standard. However, look at the blog posts as an educational exercise--try to understand the arcane details, but don't get taken in by them. While the vendors would like you to believe that, "We're right--and they're wrong," the takeaway is the larger picture of, "ODF interoperability isn't here yet."”
Alex Brown: “So I believe Rob’s statement that “SP2's implementation of ODF spreadsheets does not, in fact, conform to the requirements of the ODF standard” is mistaken on this point. This might be his personal interpretation of the standard, but it is based on an ingenious reading (argued around the meaning of comma placement, and privileging certain statements over other), and should certainly give no grounds for complacency about the sufficiency of the ODF specification.”
Michael Hickins: “Microsoft’s acceptance of ODF would thus seem to be a victory for IBM, which makes Weir’s petulance puzzling. Government customers in particular have sought alternatives to Microsoft so as not to be in the position of subsidizing a private company (i.e., Microsoft) with public monies, and IBM has long coveted this market as an opening for its own suite of applications. IBM has also been trying to wean customers off Microsoft Office in the hopes of winning them over to its Workplace collaboration tool as an alternative to Microsoft’s SharePoint. But according to Sam Hiser, former executive director of the now-defunct Open Document Foundation, Microsoft has successfully called IBM’s bluff and forced Big Blue to show its losing hand.”
Marbux: “I am a former member of the OASIS ODF Technical Committee. I left two years ago because of that big vendor-dominated TC's obdurate refusal to get started on make the ODF Interoperability Myth that the big vendors spread come true.”
Gary Edwards: “So what we have in Rob Weir is this image of a goon who skates out onto the ice whenever IBM's opposition scores a goal. And anyone who interferes in any way with their business plans is the opposition. His job is to take them out by whatever means necessary. The thing is, the guy is wearing pink tights and spouting methinks and wherefore art thous. Before you know it, the bastard sneaks up on you and is clubbing you to death with lies.”
James Clark: “I really hope I'm missing something, because, frankly, I'm speechless. You cannot be serious. You have virtually zero interoperability for spreadsheet documents. OpenDocument has the potential to be extraodinarily valuable and important standard. I urge you not to throw away a huge part of that potential by leaving such a gaping hole in your specification.”
Tim Bray: “I learned, to my dismay, that the ODF specification is silent on spreadsheet formulas, they’re just strings. This is obviously a problem; much discussion on what to do ensued. I lean to the idea, much bally-hooed by Novell, of simply figuring out what Excel does, writing that down, and building it into ODF v.Next. Mind you, anyone who’s really been to the mat with Excel, in terms of Math & Macros, knows that it isn’t a pretty picture, there are real coherency problems. But it’s good enough and the world has learned how to make it work.”
And finally I’ll speak my piece on the matter. With a nod to Oliver Bell, Doug Mahugh, and many other comments on my post, I have no issue with the ODF TC, or even with the contribution that Rob may have made to the standard. My comment and complaint is very simple, and my point of view is one that the editor of ODF apparently shares:
Patrick Durusau: “Some members of the press have confused OpenDocument supporters with people who write for NOOXML blogs and websites, or that bash OpenXML, Microsoft, ISO, JTC 1, SC 34, etc. Those are not activities that support OpenDocument.”
So that you don’t miss it, the links in the quote are pointing to Rob Weir blog posts criticizing each of these entities. Rob’s response to the dust-up over SP2? (Which was apparently directed at nobody in particular)
Rob Weir: “I've been trying to respond to the many comments by anonymous FUDsters and Fanboys on various web sites where my post is being discussed. However, it is getting rather laborious swatting all the gnats. They obviously breed in stagnant waters, and there is an awful lot of that on the web.”
I rest my case.
I’m also (not surprised) disappointed at the tendency to opt for the sensational.
I found this headline pretty interesting. Just a note to Roy… had I ( “I” ≠ “Microsoft”) asked for “IBM” to leave the ODF TC, I would have addressed all 14 IBM employees currently listed as ODF TC members. I didn’t do that. In my post, I did not identify IBM until I replied to Rob in a comment, and at no point did I speak of the role of the IBM Corporation in my post. I addressed only one person who works for IBM, and this is because he is in the role of committee co-Chair, and he has a history of criticizing Microsoft, ISO, JTC-1, Open XML, SC 34, Gary Edwards, Rick Jelliffe or virtually anyone else who dares to disagree. And FWIW, that photo used in the graphic isn’t me.
One last note on assigning my perspectives to Microsoft. Visit this post to see another Microsoft employee who sees things a little differently than I do. http://osrin.net/2009/05/back-and-forth-back-and-forth-odf-11-ods-and-interoperability/. My opinions are my opinions, just like so many other folks involved in this discussion.
As for my new “Friendo,” well, I don’t think I have a lot to say about the post. This one gives off more heat than light, it really doesn’t offer much. But there are some assertions being made that are worth correcting/addressing:
First (unfortunate that we have to keep covering this ground), war metaphors really are not appropriate for this conversation. We’re not discussing human rights violations; we’re discussing matters of software and industry. Let’s keep this in the proper perspective.
How does this stack up against TC-45? – well, if you can find a member of TC-45 conducting a blog whose apparent purpose is to criticize Open XML implementers, we’ll talk. I’m pretty sure none of those exist.
Regarding “Supporters,” someone has already covered that ground.
On this…
“I would be really, really pleased to see a top-notch quality support of ODF inside Microsoft Office. Why? Because this would be fair and unbiased competition based on one true Open Standard. It would a give a real level-playing field, where products could compete on sole merit and not on twisted situations of users’ lock-in. So trust me Gray: the world has everything to win from our competition.”
we are in [at least partial] agreement. I am very happy that we have added the Save as ODF and PDF functionality to SP2. I am glad that we are able to offer the choice of formats to users. Unfortunately Rob’s tactic here is to isolate SP2 based on one feature of one application for reasons that are being rejected in other forums. And as far as “one” standard goes, this is the part I don't agree with. Paving the entire world with a single document format doesn’t seem wise to me, and I’d rather be in the position of supporting the standards that people choose to use, rather than forcing people to use the one my product supports.
“I understand Gray. Gray is the Product Manager of Microsoft Office at Microsoft. Which means he is ultimately to blame for the lousy job Microsoft engineers have done in implementing ODF inside Microsoft Office. Gray is in the front line, and you can bet he’s having to answer some tough calls from customers right now. Gray does not have to ride the smooth « try Seven after Vista » wave; he has to go through the clutter that Microsoft’s big heads have created by thinking: What if we had ODF wrecked inside Office and get the world to believe that it’s not our fault? That’s Gray’s problem. And this is how we come to the waterboarding of Rob. But I digress.”
While the title “Product Manager” at many software companies includes responsibilities of spec writing, bug reviews, design meetings, etc., at Microsoft (at least in the Office group) it does not. I have held that type of role at other companies, but here, my focus is on enabling Developers to get more out of our software. I did not write the specs, I did not attend any design meetings, etc. The assertions made in the post are inaccurate.
I am involved because I have been working on various aspects of Open XML and document format standards in Office for almost 5 years now, and when added to prior experience in dealing with products’ implementations of document format standards, I’ve been at this for about 8 years or so. I know the neighborhood pretty well.
My ongoing disappointment with this discussion is the inability for people to apply the same principles for which Open XML was so heartily criticized (application dependence, under-specified or missing features, etc.,) to ODF implementers. I am hopeful that the chair of the ODF TC can focus his energy on solving those challenges, rather than trying to isolate Microsoft through subjective criteria. If he can’t, then he should just step aside.
As I have stated on my blog earlier, I WANT a good ODF implementation in Office to improve the satisfaction level of those interested in interoperability. I am very much rooting for a positive outcome to this discussion. We have a commitment to doing a high quality job just as we do with other aspects of our products. We will certainly focus on the demands of our customers for quality and interoperability and will continue to engage with other vendors in the years to come to (a) improve the spec and (b) improve the Interop between implementations of the spec.
To date, I have not fielded a call or request from a developer seeking to build a solution in Office with ODF. By contrast I see many requests from developers who wish to build Office solutions that include PDF. (Open XML is quite healthy as well, but I’ll leave that part out for a bit so as to not compare the two formats.) I assign no positive or negative value to the level of ODF adoption that I see when dealing with developers; if they use it, our product can write the format; if ODF is a means to improving their solution, then I will gladly provide my best effort toward ensuring that Office-based solution is top quality. Either way I am hopeful that our product can be successful in supporting whatever use people seek to achieve with it.
So today we pull back the curtain on Office 2010. Product managers know that launch is the most exciting (and exhausting) time. Late nights building demos on "not release candidate" builds, refining the storytelling, making sure we're not missing any key new features, and so on. The last mile of product communication is quite difficult and requires a lot of dotting and crossing. Watching this version of Office evolve from concept to reality has been quite a journey, filled with fascinating discussions large and small. (Remind me to blog in the future about arguments on what an acceptable download size is, and why fonts matter so much.)
This is my second launch for Office, and the 14th (or so, I've lost count after 10 years) product launch I've been involved with in my career. What is unique about the build-up to 2010 for me is the anticipation of something so new and innovative, and the expectation for Microsoft to deliver something great in this release. Now that I've been using Office 2010 for a while, I am confident that the early Tech Preview testers of the product will find a lot in there, as will the folks testing the broader public beta down the road.
I wanted to take a minute to point you to some of the great resources you can use to learn more about Office 2010.
I'll have a lot more to share on this blog as we move forward with 2010. Some of the topics that you will find here:
Office 2010 is a groundbreaking release, and with it folks will be reminded why Microsoft Office has been the leading innovator in business productivity software for 20 years.
In case you missed it yesterday, DAISY Consortium announced the release of the second version of the DAISY Translator for Word.
I've said it a few times on my blog, but I did want to say again how much we value our partnership with the DAISY Consortium, and our gratitude towards them for their help on this work. Microsoft Office is the leader in providing accessibility support in business productivity software; we are committed to continuous improvement of our support for users with disabilities.
From the press release: Michael Hingson, the President of The Hingson Group, believes that the "DAISY navigation system is one of the most significant developments to be made available since the development of Braille. DAISY allows people who are blind to move around recorded and electronic documents easily and seamlessly in a way so far only available to sighted readers."
The big news with the 2.0 release is the addition of Full DAISY Text and Audio books. Instead of converting to a DAISY XML file, you can now effectively save your Word documents as MP3 files. DAISY XML files can be read natively by some DAISY players, and the DAISY Pipeline is still available for processing those XML files. For more technical readers, this means that Version 2.0 of the translator incorporates the "Lite" version of the DAISY Pipeline, and generates full text and audio books using the Text-to-Speech service on your PC.
But the change to the 2.0 release of the Translator for Word represents a monumental simplification of this process. This is a fantastic development.
Figure 1: Screen shot of the Word Save As DAISY 2.0 Dialog Box
As part of this activity, a new web-based player is available for DAISY Talking books. Buttercup has been developed through a partnership between DAISY, Microsoft and New Zealand's Intergen to allow people to listen to & navigate DTB's through a browser via a Silverlight control.
I have also attached a full-audio full-text book of this post. This book was created only by using the translator. You can download this ZIP file, and open it with the Buttercup player. No software other than Word 2007, Vista (the TTS engine) and the translator is required to create this talking book.
If you are interested in learning more about DAISY Talking book formats, a good description has been added to the DAISY.org forums.
I would like to thank the DAISY Consortium, George Kerscher, Intergen Software, Sonata Software, and the Adaptive Technology experts from the Royal New Zealand Foundation for their support on the project.
TextGlow, the prototype Open XML Viewer for Silverlight, is one of the most popular posts in the brief history of my blog. Intergen, a Microsoft partner in New Zealand, have now elected to offer the source code for this application on OpenXMLDeveloper.org. I am very excited about this development. Silverlight is amazing, and the combination of Open XML and Silverlight is easy for a nerd like me to get excited about.
If you haven't signed up for OpenXMLDeveloper.org, now is a good time to do so. You'll find this source code as well as other projects to help your development of Open XML solutions.
Here is an excerpt from the article:
Displaying Open XML documents in Silverlight with TextGlow By James Newton-King
The Office Open XML file format has opened up a new range of possibilities to working with documents. The Microsoft Office 2007 suite of products replaces the old binary formats and produces documents in a parser friendly XML format by default instead.
TextGlow (www.textglow.net) is a Silverlight 2.0 application that leverages the new Office Open XML file format to display Word documents directly in the browser. This article will look at how to use the built-in features of Silverlight 2.0 to read content from an OpenXML package, parse XML using LINQ to XML and display the document contents using Silverlight.
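To give a rough feel for the "read the package, then parse the XML" steps the excerpt describes, here is a minimal sketch. It is written in Python with the standard library rather than Silverlight and LINQ to XML, so it is not the TextGlow code; it only illustrates the same idea. The helper names `build_sample_package` and `read_paragraphs` are made up for this illustration, and the sample package is deliberately incomplete (it holds just the `word/document.xml` part).

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_sample_package(paragraphs):
    """Build a tiny in-memory ZIP holding only a word/document.xml part.

    Hypothetical helper: this is NOT a complete .docx (no [Content_Types].xml,
    no relationship parts). It only mimics the part the reader below needs.
    """
    body = "".join(
        f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>" for text in paragraphs
    )
    document_xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", document_xml)
    buffer.seek(0)
    return buffer


def read_paragraphs(package_file):
    """Return the text of each w:p paragraph found in word/document.xml."""
    # Step 1: an Open XML package is a ZIP container, so open it as one.
    with zipfile.ZipFile(package_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    # Step 2: walk the XML and join the w:t text runs inside each paragraph.
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs


if __name__ == "__main__":
    sample = build_sample_package(["Hello", "Open XML & Silverlight"])
    for line in read_paragraphs(sample):
        print(line)
```

Pointing `read_paragraphs` at a real .docx file path should work the same way, since `zipfile.ZipFile` accepts either a path or a file-like object, though a real viewer would also need to handle styles, tables, images and the other parts of the package.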
I was recently pointed to a presentation about Open XML that raised my curiosity. It found its way to me because it included my picture, but the content is what's on my mind. I take the Open XML discussion pretty seriously; I've had very interesting and stimulating discussions about Open XML with a lot of folks, but I've also seen a lot of the nonsense that makes the discussion cloudy and difficult (see below).
The slide references a comment I made in a ZDNet Australia interview in reference to advantages of XML-based document formats over binary formats for enabling better security. Specifically, having file formats represented in XML makes parsing simpler, because XML documents are expressed using a pre-defined (in this case public) schema. They can be easier to parse than binary formats, which can be opaque and obscure, even when you already have its documentation. Given a choice, I'm sure that 99/100 developers would prefer to work with an XML-based format over a binary format, if only for the sake of simplicity, and my comment here illustrates one of those reasons.
The deck goes on to state that Open XML allows "arbitrary binary blobs of data", citing this as a "security hole" (this isn't really anything new; this has been rehashed on several forums). I'll just take a guess and say the presenter probably missed a few important references about ODF (search for "Binary" in the text), or within the ODF spec itself… Section 9.3 of the ODF specification discusses how frames can contain "Objects represented either in the OpenDocument Format or in a Object Specific Binary Format." Section 9.3.5, describes the ability to add "plug-ins" to documents for "a media type that is not usually handled natively by office application software." Base64Binary is a core data type of ODF, as described in section 16.1.
Of course both Open XML and ODF allow the embedding of binary content. So I guess it's not clear to me why we're picking on the binary DevMode structure when (so-called "arbitrary") Binary data is supported in both formats (and probably every other authoring file format that is in widespread use today). If the implication is that ODF doesn't allow the inclusion of "arbitrary" binary information the implication is absurd and false. By this logic I'd guess it's worth a question to OASIS if we should expect binary data to be removed from a future version of the ODF spec? – I know the answer to that question; it's not even worth asking.
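For readers who have not seen binary content carried inside an XML document, here is a small sketch of the general base64 technique. It is only an illustration in Python: the element name `binary-data` and the functions `embed_binary` and `extract_binary` are made up for this example and are not taken from either the ODF or the Open XML schema.

```python
import base64
import xml.etree.ElementTree as ET


def embed_binary(payload: bytes) -> str:
    """Wrap raw bytes in a hypothetical <binary-data> element as base64 text."""
    element = ET.Element("binary-data")  # hypothetical element name
    element.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(element, encoding="unicode")


def extract_binary(xml_text: str) -> bytes:
    """Parse the element back and decode its base64 text into raw bytes."""
    element = ET.fromstring(xml_text)
    return base64.b64decode(element.text or "")


if __name__ == "__main__":
    blob = bytes([0x00, 0xFF, 0x10, 0x80])  # arbitrary, non-textual bytes
    xml_text = embed_binary(blob)
    print(xml_text)
    print(extract_binary(xml_text) == blob)
```

The XML parser only ever sees ordinary text, and what those bytes mean is entirely up to the application that decodes them. That holds for any XML-based format that allows this kind of payload, which is the point being made above.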
I haven't heard the deck presented, nor do I plan to tear the rest of it down (might be fun for a rainy day), but it looks to me like whoever created this slide deck is attempting to criticize a fundamental purpose of XML. Or maybe this is a criticism of the entire list of XML-based format specifications. Nothing about this criticism is specific to Open XML… it is an indictment of XML and document formats.
It seems odd to pick a fight with yourself (… very Fight Club-ish… "I am Jack's Self-deprecating Argument"…);
The discussion about parsing XML formats vs. binary formats is equally applicable to Open XML, ODF, UOF, CDF, or (pick your XML-based format of the day). These slides contribute nothing to the XML formats discussion other than confusion. Part of the reason that the XML Formats debate exists is because (I think) we at least agree that XML offers us better opportunities for document format management than a binary format would… but according to their point of view, I seem to be mistaken on that point. I must also be seeing things, because when I read the ODF spec, I see a lot of "arbitrary" binary data types in there too… obviously I've missed something.
Silly me :)