What is "Custom XML?" ... and the impact of the i4i judgment on Word - Gray Matter - Site Home - TechNet Blogs

Gray Matter

Gray Knowlton's blog on Microsoft Office

What is "Custom XML?" ... and the impact of the i4i judgment on Word

I recall saying recently that "this is my last post for 2009." Whoops... I don't think I was anticipating this. I watched with interest yesterday the coverage and reaction to the i4i judgment. I am not keen to share my own thoughts about the case here, but I would like to offer clarity around the specific area of Word in question, and suggestions for what people can do about it if they are using that functionality today. There is much confusion about the part of Word that is actually affected.

First, some things to understand:

We do not anticipate any interruption in the availability of Word or Office 2007. Additionally, this ruling has no impact on the scheduled availability of Office 2010, which is planned for the first half of CY2010.

Current users are not affected. If you are using the custom XML tags in Word 2003 or 2007 (these show up in Word as Pink Tags around tagged content), you are free to continue doing so with the products you have already purchased.

Open XML standards (all ECMA and ISO versions) are not affected. Even if Word's specific implementation of custom XML support does infringe the i4i patent (which Microsoft does not believe to be the case), i4i has never claimed that its patent is essential to the OXML standard.

Content Controls in Word (screen shot below) are not affected. In Word 2007 and Word 2010, this is a common method of binding document content to data stored in a custom-defined schema within a document.

[Screen shot: Content Controls in Word]

The functionality that is in question is indicated by the screen shot below. Custom XML Tags in Word documents are visible in the Word user interface as Pink Tags surrounding tagged content in a document.

[Screen shot: Custom XML Tags shown as Pink Tags in Word]

What you can do if you have questions about your solutions that use Custom XML Tags:

First, download the Office 2010 beta and test your solution. If your solution works in Office 2010, it does not depend on the functionality in question. If your solution does utilize Custom XML Tags, consider re-implementing the solution using Content Controls. Detailed guidance on the use of Content Controls in Word 2007 can be found here. Also note the Word Content Controls Toolkit on CodePlex. The Open XML SDK, of course, is quite useful for getting people up to speed on developing solutions for Word and Open XML.

Update: Additional Detail

In response to several inquiries on the topic, I have included additional text describing the feature area that is affected vs. what is not affected, including links to KB articles which illustrate the capabilities in more detail. 

Affected:

Word 2003 and Word 2007 distributed prior to 1/11/2010 can read files that contain XML markup (ref: “Understanding Word's XML Markup [Word 2003 XML Reference]”, http://msdn.microsoft.com/en-us/library/aa212889(office.11).aspx). When custom XML markup is present, Word delineates this content in the document, which allows it to later save the file as .DOCX, .DOCM, or .XML with that content marked up.

The Word 2007 product distributed by Microsoft after 1/10/2010 will no longer read the Custom XML markup contained within .DOCX, .DOCM, or .XML files.  These files will continue to open, but the Custom XML markup tags will be removed. Custom XML markup stored within .DOC files will not be affected by these changes.  Word 2003 and existing installations of Word 2007 will not be affected by this change.
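As a rough way to check whether a particular file contains the markup described above, here is a minimal sketch in Python. It assumes the Custom XML markup appears as w:customXml elements in the main document part at the conventional path word/document.xml, and it inspects only that part (not headers, footers, or other parts). The function name count_custom_xml_tags is hypothetical; this is an illustration, not a Microsoft tool.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by document.xml in .docx/.docm packages.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_custom_xml_tags(docx_path):
    """Return the number of w:customXml elements in the main document part.

    docx_path may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_path) as package:
        # Assumption: the main part lives at the conventional location.
        with package.open("word/document.xml") as part:
            root = ET.parse(part).getroot()
    # Count every element whose qualified name is exactly w:customXml.
    return sum(1 for _ in root.iter(f"{{{W_NS}}}customXml"))


if __name__ == "__main__":
    import sys

    # Usage: python check_custom_xml.py file1.docx file2.docx ...
    for path in sys.argv[1:]:
        print(path, count_custom_xml_tags(path))
```

A count of zero suggests the main part holds no Custom XML Tags. Legacy .DOC files are not ZIP packages, so this approach does not apply to them.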

Not Affected:

Word 2007 also added features allowing Content Controls to map to XML data stored in a DOCX or DOCM file (ref: “Mapping Word 2007 Content Controls to Custom XML Using the XML Mapping Object”, http://msdn.microsoft.com/en-us/library/bb510135.aspx). Content Controls and XML data stored within DOCX or DOCM files will not be affected by this change. 
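To see which Content Controls in a file are mapped to XML data, a sketch along the following lines may help. It assumes content controls are stored as w:sdt elements in word/document.xml and that a mapping is recorded as a w:dataBinding element (with a w:xpath attribute) inside the control's w:sdtPr properties. The function name list_bound_content_controls is hypothetical, and only the main document part is inspected.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, in ElementTree's {uri}name notation.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"


def list_bound_content_controls(docx_path):
    """Return (tag, xpath) pairs for content controls that carry a data binding."""
    with zipfile.ZipFile(docx_path) as package:
        # Assumption: the main part lives at the conventional location.
        with package.open("word/document.xml") as part:
            root = ET.parse(part).getroot()

    bindings = []
    for sdt in root.iter(f"{W}sdt"):
        props = sdt.find(f"{W}sdtPr")
        if props is None:
            continue
        binding = props.find(f"{W}dataBinding")
        if binding is None:
            continue  # an unmapped content control
        tag = props.find(f"{W}tag")
        tag_value = tag.get(f"{W}val") if tag is not None else None
        bindings.append((tag_value, binding.get(f"{W}xpath")))
    return bindings
```

Calling list_bound_content_controls("report.docx") (a hypothetical file name) returns one pair per mapped control, giving the control's tag and the XPath it is bound to.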

 

 

Comments
  • Does this mean that we can still use our solution built around content controls mapped to custom XML within the document? Will the only change be that we cannot visualize the mapping inside the Word application?

  • Shiv,

    Yes,  you can continue using your solution based on content controls mapped to custom XML. And you can expect to continue seeing the content control mapping inside the Word document.  Content controls are not affected by this.

  • Hi,

    I have reviewed the blog and still have a few questions:

    a. For example, if my document contains an attached XML schema, will that be affected? If yes, then which uses will be affected?

    b. Will my solution, which is based on content controls, Custom XML, and mapping, continue to work?

    c. Will that affect the Word 2003 XML Schema feature as well (using the Office Compatibility Pack)?

  • Great blog entry - you're the first to address the implications of the i4i ruling on vsto development, and I was wondering the same thing.

    I did a quick test on office 2010 beta release, and was able to bind custom xml data to the content controls with no problems.  So far so good...

  • Ankish, I hope my addition to the post will answer your questions.

  • Hi Gray,

    From what I gather, the .DOC format with attached Custom XML is unaffected. So the thing to do is use .DOC format for this type of solution. Correct?

  • Thanks Gray for this post. The last thing we want (and our clients/partners need) is more confusion, and you do a great job of defusing it.

    Francis Dion

    CEO, Xpertdoc Technologies Inc.

    http://francisdion.blogs.com/software_process/

  • "Custom XML markup stored within .DOC files will not be affected by these changes."

    - Does this mean I can still use 2007/2010-versions of Word to insert custom XML tags into .doc documents?

    (My open source document generator depends heavily on custom XML tags: http://flexdoc.codeplex.com, so I'm very, very disappointed this feature is being dropped!)

  • What does the ruling mean for custom Ribbon development in MS Word? I distribute a Ribbon to my customers, and Ribbons have an XML component.

    From what I have read, it looks like custom Ribbon add-ins will be unaffected, and the ruling will mostly affect large document management projects.

    Could this be confirmed please?

    Thanks

  • Thank you for explaining it. Now I have another question: How do you tell if the Custom XML capability has been removed from the version of Word installed? File dates? File version numbers? Error messages?

  • Thanks Gray, that post was just the clarification I was looking for! It probably doesn't help that document properties are stored in 'Custom.xml' in a docx file... :)

  • Hi Gray,

    First of all, thanks for your post, which will save me a lot of wasted time.

    I have an old project (template + VBA) written in Word XP in which I programmatically swap portions of the document and attach attributes to them that are hidden from the user. Of course I couldn't use XML, so I achieved my goals using Hidden Text and Paragraph Styles. My solution was not very sound, so I'm in the process of rewriting it for Word 2007 and 2010 using Custom XML Markup.

    I have written MyXMLSchema.xsd with elements and attributes, and I programmatically tag portions of the document.

    But it seems that I'm in exactly the situation you are describing. When I open a tagged document in Word 2010 Beta, the Custom XML (pink tags) doesn't show. I'm still able to tag a document in Word 2010 Beta, but after I save the file and reopen it, nothing is there and I have to retag everything, although the Custom XML is still there in the document.xml of the Package.

    So I think I have to use Content Controls, as you suggest, which is probably even better because I can prevent the user from deleting the tags/controls.

    But I have some problems:

    1) How can I attach Attributes to Content Controls?

    2) I don't need (or want) to map my Rich Text Controls to a Custom XML Part, and I'm afraid of needlessly making my files bigger.

    BTW, do you think I'm on the right track, or could I use a different approach?

    Thanks in advance, Lauro

  • I don't understand this at all. The i4i patent relates to a separation of metacodes and content, doesn't it? Where is the separation, if it is the pink tags only?

    If this relates to concrete internal implementation issues only, how did i4i know without access to the source code? (Was the source code examined in the trial?)

    And what kind of addressing was actually used?  A kind of XPath? A kind of tumbler? A kind of ID? A kind of numeric offset?

  • Two questions:

    1) My organization has over 10,000 employees with more than 25,000,000 office files in active, spinning storage.  Does Microsoft have any tool that we can use for discovering if any of these Office files contain the impacted content?

    2) We (of course) have a pre-1/11/2010 version of Word installed now. Will the removal of this functionality come in some form of a service pack, security patch or Office Update, or will this change in Word's behavior only happen if we explicitly re-install Word from a post-1/10/2010 copy of Word?

  • I'm currently working on a system (in PHP) to manipulate Word documents as part of a DMS.

    This system replaces marked text with Content Controls. However, Content Controls do not offer functionality to do similar things with tables.

    As a semi-nasty workaround I'm currently using the w:customXml tags (and w:element attribute) to tag server generated tables so I can keep them updated when a user has re-uploaded the document.

    I tried the new Office 2010 beta and found out that it removes these tags (as expected) when a user saves the document so I can no longer tag tables like this.

    So my question:

    Is there another tag like the w:customXml tags which I can (ab)use to tag my generated tables?

    I just need a way to tag a table which is preserved after a document is saved. I don't particularly care if the solution is 'clean' as long as it's likely to work with future versions of Word.
