<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.technet.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>E-Discovery and Microsoft Technology : Categorizing data</title><link>http://blogs.technet.com/ediscovery/archive/tags/Categorizing+data/default.aspx</link><description>Tags: Categorizing data</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Tag-inator 3: Rise of the Machines</title><link>http://blogs.technet.com/ediscovery/archive/2009/03/27/tag-inator-3-rise-of-the-machines.aspx</link><pubDate>Fri, 27 Mar 2009 22:18:18 GMT</pubDate><guid isPermaLink="false">d5e57398-b9ef-4490-9955-07cbb4e4a80d:3219195</guid><dc:creator>chris.chalmers</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.technet.com/ediscovery/comments/3219195.aspx</comments><wfw:commentRss>http://blogs.technet.com/ediscovery/commentrss.aspx?PostID=3219195</wfw:commentRss><description>&lt;h3&gt;Can a machine categorize or tag email any better than a person? &lt;/h3&gt;  &lt;p&gt;In our previous post, we explored how to make it easy for users to categorize an email inside of Outlook. But what if the user still isn't doing it, in spite of how easy we've made it? Or what if the user has good intentions, but makes an honest mistake? Is there any way to automate this? &lt;/p&gt;  &lt;p&gt;Yes we can automate this, with Exchange 2007 Transport Rules. Does it do a &amp;quot;better&amp;quot; job than a human? Let's take a look.&lt;/p&gt;  &lt;p&gt;First off, here's a nice summary of the &amp;quot;Transport Rules&amp;quot; feature on the MS Exchange Team blog. 
It's useful for many things besides message classification: &lt;a href="http://msexchangeteam.com/archive/2006/12/12/431879.aspx"&gt;http://msexchangeteam.com/archive/2006/12/12/431879.aspx&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;Next, let's take a sample email, and see how a transport rule would work. &lt;/p&gt;  &lt;p&gt;&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;FROM: Legal Department&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;TO: Executive Staff&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;SUBJECT: Courtroom strategy for Them v. Us&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New"&gt;Please do not bring your Blackberry into the courtroom on days when you are testifying. It makes us look bad, and if you keep looking at your Blackberry while giving answers, the opposing counsel may want to see it, too.&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;Transport rules can use many fields as inputs (also called 'conditions' or 'predicates') when making a decision about whether or not to classify an email. For example, in the sample email above, the fact that the sender was a member of the Legal group, and/or the recipient was a member of the Executive staff can be considered by the transport rules engine. Also, the appearance of the phrases &amp;quot;Them v. Us&amp;quot; and &amp;quot;courtroom strategy&amp;quot; in the subject line of the message can be added to the evaluation.&lt;/p&gt;  &lt;p&gt;Here's a complete list of conditions/predicates that Transport Rules can use to classify an email: &lt;a href="http://technet.microsoft.com/en-us/library/aa995960.aspx"&gt;http://technet.microsoft.com/en-us/library/aa995960.aspx&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;Is it a &lt;em&gt;guarantee&lt;/em&gt; that the sample message above is attorney-client privileged? No, but it's &lt;em&gt;likely&lt;/em&gt;. 
And knowing that is valuable, especially when it can be calculated for free.&lt;/p&gt;  &lt;h3&gt;Transport Rules bring three benefits to message classification:&lt;/h3&gt;  &lt;p&gt;1) They’re always working. They don't get tired, or forgetful, or confused. They don’t get in a hurry on Friday afternoon. They always fire, no matter what.&lt;/p&gt;  &lt;p&gt;2) They reduce the search scope. You're still going to have lawyers review emails as part of the e-discovery process, but anything you can do to reduce the search scope will save you money. If an expert is double-checking the 5% of messages that the machine thinks might be privileged, you’ve just made your search problem 20 times smaller.&lt;/p&gt;  &lt;p&gt;3) Classification happens immediately when the message is sent, there's no waiting around for a skilled person to review it. Suppose in our example, two employees who are not members of the legal team are discussing &amp;quot;Them v. Us.&amp;quot; Their conversation might be privileged and they don’t know it. Or their conversation might be forbidden by company policy. Or any email discussing the case cannot be sent outside the company. Transport Rules can help with all of that. &lt;/p&gt;  &lt;h3&gt;Transport Rules can do more than just classify. &lt;/h3&gt;  &lt;p&gt;Expanding upon example 3 above, I noted you can suppress (not deliver) certain kinds of emails. There's a whole host of actions that can be taken based upon the message's contents, including:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Notifying managers, &lt;/li&gt;    &lt;li&gt;Logging the message in a special archive, &lt;/li&gt;    &lt;li&gt;Altering the contents by appending a disclaimer, &lt;/li&gt;    &lt;li&gt;Etc. 
&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;Here's a complete list of actions that can be taken: &lt;a href="http://technet.microsoft.com/en-us/library/aa998315.aspx"&gt;http://technet.microsoft.com/en-us/library/aa998315.aspx&lt;/a&gt;&lt;/p&gt;  &lt;h3&gt;Regular Expressions in Transport Rules&lt;/h3&gt;  &lt;p&gt;People sometimes ask me, &amp;quot;What kinds of patterns can I use? Can I trap social security numbers (nnn-nn-nnnn) or credit card numbers (nnnn-nnnn-nnnn-nnnn) in emails?&amp;quot;&lt;/p&gt;  &lt;p&gt;The answer is yes, but it gets pretty geeky pretty fast. Exchange uses a technology called Regular Expressions. These are pretty common in the world of programming (and Unix administration), but not for the faint of heart. Here's a quick primer on what they are: &lt;a href="http://en.wikipedia.org/wiki/Regular_expressions"&gt;http://en.wikipedia.org/wiki/Regular_expressions&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;What is so great about this feature is that it has all the power and flexibility of serious computer programming, but the features are exposed right in the Exchange Management Console: All you have to do is type your expression into a dialog box; there's no scripting or programming required to use the feature. &lt;/p&gt;  &lt;p&gt;Here are specifics for adding Regular Expressions to your Exchange Transport Rules: &lt;a href="http://technet.microsoft.com/en-us/library/aa997187.aspx"&gt;http://technet.microsoft.com/en-us/library/aa997187.aspx&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;In conclusion, an existing out-of-box Exchange feature can automatically categorize email, with varying degrees of precision depending upon how complex a rule you want to write. 
If you're concerned about end-user compliance with tagging schemes, or just want an extra layer of security and common sense around your message handling, Transport Rules are for you.&lt;/p&gt;&lt;img src="http://blogs.technet.com/aggbug.aspx?PostID=3219195" width="1" height="1"&gt;</description><category domain="http://blogs.technet.com/ediscovery/archive/tags/Exchange+2007/default.aspx">Exchange 2007</category><category domain="http://blogs.technet.com/ediscovery/archive/tags/Categorizing+data/default.aspx">Categorizing data</category></item><item><title>Tag! Episode 2: Email Messages Tagged While You Wait</title><link>http://blogs.technet.com/ediscovery/archive/2009/01/08/tag-episode-2-email-messages-tagged-while-you-wait.aspx</link><pubDate>Fri, 09 Jan 2009 04:31:24 GMT</pubDate><guid isPermaLink="false">d5e57398-b9ef-4490-9955-07cbb4e4a80d:3178562</guid><dc:creator>chris.chalmers</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.technet.com/ediscovery/comments/3178562.aspx</comments><wfw:commentRss>http://blogs.technet.com/ediscovery/commentrss.aspx?PostID=3178562</wfw:commentRss><description>&lt;p&gt;When email messages need to be examined for attorney/client privilege, discovery slows to a crawl. And obviously the expense of preparing for discovery goes way up. If only there were an easy way to get users to tag email messages so people don’t have to try so hard later on figuring out what’s relevant and what’s privileged. &lt;/p&gt;  &lt;p&gt;But most tagging systems are doomed to fail, due to lack of end-user participation. 
I recently had the pleasure of seeing Mark Diamond, CEO of Contoural (&lt;a href="http://www.contoural.com"&gt;www.contoural.com&lt;/a&gt;) give a presentation, and he insisted that tagging schemes needed to obey the “5-second Rule.”&lt;/p&gt;  &lt;h2&gt;The 5-Second Rule:&lt;/h2&gt;  &lt;p&gt;If it takes the user longer than 5 seconds to tag the document (or record, or email), he or she is going to start looking for ways to get around the system instead of providing the required metadata.&lt;/p&gt;  &lt;p&gt;Pretty sad, but entirely believable. When you think of how much trouble an employee might be saving the company by correctly tagging a message, it just makes you shake your head. Then again, incorrectly tagging a message that probably will never get examined anyway is hardly going to bring one’s company to its knees. 5 seconds is about right. Fortunately, Outlook 2007 and Exchange 2007 have a new tool that fits the bill: Message Categorization.&lt;/p&gt;  &lt;h2&gt;Exchange 2007 Message Categorization&lt;/h2&gt;  &lt;p&gt;This feature lets Exchange administrators create a customized drop-down menu of message categories, like “Attorney/Client Privileged,” “Company Confidential,” etc. that end users can rapidly apply to any message (in less than 5 seconds). Administrators control the name of the Category that appears in the menu, the “helper text” that appears to the end user in Outlook, as well as any text appended to the message itself.&lt;/p&gt;  &lt;p&gt;Exchange and Outlook ship with two “default” classifications built-in, so you can see how it works, but the system is completely customizable; you’re free to create new categories or edit the default ones. 
Here’s a great overview of how it works: &lt;a title="http://technet.microsoft.com/en-us/library/bb123498.aspx" href="http://technet.microsoft.com/en-us/library/bb123498.aspx"&gt;http://technet.microsoft.com/en-us/library/bb123498.aspx&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;And here’s a screenshot of what the end user sees in Outlook 2007:&lt;/p&gt;  &lt;p&gt;&lt;img alt="Selecting the Message Classification" src="http://i.technet.microsoft.com/Bb123498.e5de8be8-c7e2-42b6-91f8-ffc6e857b75d(en-us,EXCHG.80).gif" /&gt;&lt;/p&gt;  &lt;p&gt; In our next post, we’ll examine additional rules and behaviors that Exchange administrators can configure based on message categorization. &lt;/p&gt;&lt;img src="http://blogs.technet.com/aggbug.aspx?PostID=3178562" width="1" height="1"&gt;</description><category domain="http://blogs.technet.com/ediscovery/archive/tags/Exchange+2007/default.aspx">Exchange 2007</category><category domain="http://blogs.technet.com/ediscovery/archive/tags/Categorizing+data/default.aspx">Categorizing data</category><category domain="http://blogs.technet.com/ediscovery/archive/tags/Outlook+2007/default.aspx">Outlook 2007</category></item><item><title>Tag! Metadata Made Easy in Vista</title><link>http://blogs.technet.com/ediscovery/archive/2008/10/19/tag-metadata-made-easy-in-vista.aspx</link><pubDate>Mon, 20 Oct 2008 03:47:00 GMT</pubDate><guid isPermaLink="false">d5e57398-b9ef-4490-9955-07cbb4e4a80d:3138839</guid><dc:creator>chris.chalmers</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.technet.com/ediscovery/comments/3138839.aspx</comments><wfw:commentRss>http://blogs.technet.com/ediscovery/commentrss.aspx?PostID=3138839</wfw:commentRss><description>&lt;P&gt;Just in time for Halloween, a scary story about metadata! Who knows what awful, incriminating secrets lie hidden in your Word document's metadata, waiting to betray you? After all, isn't that why attorneys request "native format" document production? 
So they can revel in the smoking guns buried in your metadata that you never even knew you were transmitting?&lt;/P&gt;
&lt;P&gt;It wasn't supposed to be this way. And in Vista, it isn't. &lt;/P&gt;
&lt;P&gt;Metadata is supposed to help the user. And Vista makes it easier than ever to view and edit a document's metadata, search using metadata, and remove document metadata - all without having to launch the underlying application that created the file (like Word, Excel or PowerPoint).&lt;/P&gt;
&lt;P&gt;It can also be part of a larger "Manage in Place" strategy for e-discovery. Step One is simply making the end users aware of the metadata they generate, and Step Two is giving them an easy tool to edit it. Just another little way to make it easy to categorize documents in-place, and reduce the amount of unnecessary, discoverable data lying around your network.&lt;/P&gt;
&lt;H4&gt;&lt;STRONG&gt;So how do I make it easy to view metadata in Vista's Windows Explorer?&lt;/STRONG&gt; &lt;/H4&gt;
&lt;P&gt;First off, here's a quick overview of all the enhancements to Windows Explorer in Vista:&lt;/P&gt;
&lt;P&gt;&lt;A href="http://www.microsoft.com/windows/windows-vista/features/explorers.aspx" mce_href="http://www.microsoft.com/windows/windows-vista/features/explorers.aspx"&gt;http://www.microsoft.com/windows/windows-vista/features/explorers.aspx&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;We're primarily interested in the new "Details" pane at the bottom of the explorer. By right-clicking on the pane, you can choose to make your view Small, Medium, or Large. Here's a screenshot of the Details pane set to display "large" amounts of metadata. &lt;/P&gt;
&lt;P&gt;&lt;A href="http://blogs.technet.com/blogfiles/ediscovery/WindowsLiveWriter/TagMetadataMadeEasyinVista_1241B/metadata1_2.jpg" mce_href="http://blogs.technet.com/blogfiles/ediscovery/WindowsLiveWriter/TagMetadataMadeEasyinVista_1241B/metadata1_2.jpg"&gt;&lt;IMG style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title=metadata1 border=0 alt=metadata1 src="http://blogs.technet.com/blogfiles/ediscovery/WindowsLiveWriter/TagMetadataMadeEasyinVista_1241B/metadata1_thumb.jpg" width=244 height=144 mce_src="http://blogs.technet.com/blogfiles/ediscovery/WindowsLiveWriter/TagMetadataMadeEasyinVista_1241B/metadata1_thumb.jpg"&gt;&lt;/A&gt; &lt;/P&gt;
&lt;P&gt;(Thanks to the Windows Live Writer beta team, which made putting this screenshot into my blog incredibly simple).&lt;/P&gt;
&lt;P&gt;As you can see, it's easy to view the metadata associated with any file simply by clicking on it - opening is not required. Also, if you use the expanded "Details" view in Explorer, you can have any metadata field displayed, not just size and creation date. &lt;/P&gt;
&lt;H4&gt;&lt;STRONG&gt;Next: Searching your metadata&lt;/STRONG&gt;&lt;/H4&gt;
&lt;P&gt;The Search panel is in the upper-right corner of the Explorer. Of course you can type in keywords like "short-sale" or phrases like "credit default swap." By default, these searches are applied to the full text of the documents you're searching. If you want to restrict your search to just the metadata fields, begin your query with the metadata field name. For example, these are valid metadata queries:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;comments: review&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;author: chris&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;And these queries can be strung together. For example, to find documents about Contoso that Dave authored and whose comments contain the phrase "needs review" or "fix this", the query would look like this: &lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Contoso author:dave comments:("needs review" OR "fix this")&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Here's a complete description of the query language used by Windows Desktop Search (built into Vista, and a free download for Windows XP): &lt;A href="http://www.microsoft.com/windows/products/winfamily/desktopsearch/technicalresources/advquery.mspx" mce_href="http://www.microsoft.com/windows/products/winfamily/desktopsearch/technicalresources/advquery.mspx"&gt;http://www.microsoft.com/windows/products/winfamily/desktopsearch/technicalresources/advquery.mspx&lt;/A&gt;&lt;/P&gt;
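&lt;P&gt;If you build these queries often, the field:value pattern is easy to generate. Here is a minimal Python sketch, assuming nothing beyond the syntax shown above. The helper names (quote_term, field_clause, build_query) are made up for illustration, and the script only assembles the text you would paste into the Search box; it does not call Windows Search itself.&lt;/P&gt;
&lt;PRE&gt;
```python
def quote_term(term):
    """Wrap multi-word phrases in straight double quotes; leave single words bare."""
    return '"{}"'.format(term) if " " in term else term


def field_clause(field, terms):
    """Build field:value, or field:(a OR b) when several alternatives are given."""
    quoted = [quote_term(t) for t in terms]
    if len(quoted) == 1:
        return "{}:{}".format(field, quoted[0])
    return "{}:({})".format(field, " OR ".join(quoted))


def build_query(keywords, fields):
    """Join free-text keywords and metadata field clauses with spaces."""
    parts = [quote_term(k) for k in keywords]
    parts += [field_clause(name, terms) for name, terms in fields]
    return " ".join(parts)


# Hypothetical inputs: one keyword plus two metadata restrictions.
query = build_query(
    ["Contoso"],
    [("author", ["dave"]), ("comments", ["needs review", "fix this"])],
)
print(query)
```
&lt;/PRE&gt;
&lt;P&gt;Note that the sketch always emits straight quotes. Curly quotes pasted in from a word processor may not be treated as phrase delimiters, so it is worth checking for them if a query returns nothing.&lt;/P&gt;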
&lt;H4&gt;&lt;STRONG&gt;What about removing the metadata?&lt;/STRONG&gt; &lt;/H4&gt;
&lt;P&gt;Ahh, the best feature of all. You can select one or several files in Windows Explorer, right-click them, and choose Properties…Details. From there you'll see the link to "Remove properties and personal information." Here's a screenshot:&lt;/P&gt;
&lt;P&gt;&lt;A href="http://blogs.technet.com/blogfiles/ediscovery/WindowsLiveWriter/TagMetadataMadeEasyinVista_1241B/metadata2_2.jpg" mce_href="http://blogs.technet.com/blogfiles/ediscovery/WindowsLiveWriter/TagMetadataMadeEasyinVista_1241B/metadata2_2.jpg"&gt;&lt;IMG style="BORDER-RIGHT-WIDTH: 0px; DISPLAY: inline; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px" title=metadata2 border=0 alt=metadata2 src="http://blogs.technet.com/blogfiles/ediscovery/WindowsLiveWriter/TagMetadataMadeEasyinVista_1241B/metadata2_thumb.jpg" width=178 height=244 mce_src="http://blogs.technet.com/blogfiles/ediscovery/WindowsLiveWriter/TagMetadataMadeEasyinVista_1241B/metadata2_thumb.jpg"&gt;&lt;/A&gt; &lt;/P&gt;
&lt;P&gt;(Did you notice the "Last Printed" field? Interesting…) Note that you can remove all the metadata fields, or only some of them. &lt;/P&gt;
&lt;P&gt;Here's more information about how to bulk-remove metadata from a group of files:&lt;/P&gt;
&lt;P&gt;&lt;A href="http://windowshelp.microsoft.com/Windows/en-US/Help/06605e2a-3f56-488b-8415-702813f24c791033.mspx#EQB" mce_href="http://windowshelp.microsoft.com/Windows/en-US/Help/06605e2a-3f56-488b-8415-702813f24c791033.mspx#EQB"&gt;http://windowshelp.microsoft.com/Windows/en-US/Help/06605e2a-3f56-488b-8415-702813f24c791033.mspx#EQB&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Best of all, Windows will even prompt you to either overwrite the original file, or create a new metadata-free version of the file. &lt;/P&gt;
&lt;P&gt;Hopefully this post has encouraged you to embrace, or at least consider, Vista's improved metadata handling features. Even the simplest usage of metadata can be superbly helpful. For example, I review lots of PowerPoint presentations in my line of work: Simply adding the word "good" to the Comments field of the ones I liked makes it much easier to find them later. &lt;/P&gt;
&lt;P&gt;And if you're afraid the metadata bogeymen are out to get you, you can now banish them forever with a couple of mouse clicks. Happy Halloween!&lt;/P&gt;&lt;img src="http://blogs.technet.com/aggbug.aspx?PostID=3138839" width="1" height="1"&gt;</description><category domain="http://blogs.technet.com/ediscovery/archive/tags/Windows+Vista/default.aspx">Windows Vista</category><category domain="http://blogs.technet.com/ediscovery/archive/tags/Categorizing+data/default.aspx">Categorizing data</category></item></channel></rss>