The Storage Team Blog about file services and storage features in Windows and Windows Server.
Over the long weekend, one of my colleagues and I were discussing the new File Classification Infrastructure in the upcoming R2 release.
My colleague asked me: “Why do you keep emphasizing the importance of separating the classification from the action? Why not just take the action as you classify the file? For example, if a product that does content-based analysis finds that a file on a public server has personal information, it should delete the file immediately.”
Needless to say, I agreed with my colleague that this is a good action to take during classification, but I also pointed out that it is only a partial solution. For example, what if the file is not on a public server? We probably do not want to delete it, but we do want to make sure that when the file is backed up, it goes to an encrypted store so that if the tape is lost, the information does not leak out. Also, what if the file was classified as having “personal information” by a user and not by the content-based analysis? It still needs to be deleted from the public server (if this is the policy we want to apply, of course).
My colleague summarized this nicely by saying: “So, my example above was just one case of classifying and applying policy.”
We went on to discuss the example of the content analysis product marking the file as having personal information and the backup application then backing up all personal information files to an encrypted store.
I said that the way we approached this with the File Classification Infrastructure is to allow the IT administrator to define a property that is attached to the file to indicate that the file has personal information (e.g., “PersonalInformation=Yes”). The content analysis product then assigns this property to the file (or a user can set the property on the file manually). The administrator can then configure the backup application so that all files classified as “PersonalInformation=Yes” are backed up to an encrypted store (of course, the backup application needs to be “classification” enabled).
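To make the idea concrete, here is a minimal sketch in Python. It is not the FCI API: `FileRecord`, `content_analysis_classify`, `user_classify`, and `backup_target` are hypothetical names, the "content analysis" is a toy keyword check, and the store names are made up. It only illustrates that the backup side reads the property and does not care who set it.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """Hypothetical in-memory stand-in for a file and its attached properties."""
    path: str
    content: str
    properties: dict = field(default_factory=dict)


def content_analysis_classify(f: FileRecord) -> None:
    """Toy content analysis (assumption): flag files that mention an SSN."""
    if "ssn" in f.content.lower():
        f.properties["PersonalInformation"] = "Yes"


def user_classify(f: FileRecord, value: str) -> None:
    """Manual classification sets the very same property."""
    f.properties["PersonalInformation"] = value


def backup_target(f: FileRecord) -> str:
    """Classification-enabled backup: looks only at the property value."""
    if f.properties.get("PersonalInformation") == "Yes":
        return "encrypted-store"
    return "standard-store"


scanned = FileRecord("hr/payroll.txt", "Name: A. Smith, SSN: 000-00-0000")
tagged = FileRecord("hr/notes.txt", "call back on Monday")
plain = FileRecord("docs/readme.txt", "welcome")

# Automatic classification runs over every file.
for f in (scanned, tagged, plain):
    content_analysis_classify(f)

# A user marks a file that the content analysis did not catch.
user_classify(tagged, "Yes")

targets = {f.path: backup_target(f) for f in (scanned, tagged, plain)}
```

In this sketch both `hr/payroll.txt` (classified automatically) and `hr/notes.txt` (classified by a user) end up mapped to the encrypted store, while `docs/readme.txt` goes to the standard one.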
He agreed that this works but wanted to know why the backup application should not just work directly with the content analysis product to determine which action needs to be taken. In his words: “Instead of setting ‘PersonalInformation=Yes’ on the file, the content analysis product can just set which policy the backup application should apply, say a GUID that the backup application recognizes and knows means to back up this file to an encrypted store.”
On the surface, this sounds like a great idea (and it works well for some scenarios, such as rights management), but I asked him what happens if a new application (say, an archival application) is introduced to the mix: how does this application understand what to do with this GUID? Furthermore, what if the backup application does not work with the specific content analysis application? And what does he expect a user to enter when manually classifying files as having “Personal Information”?
We could have continued this back and forth for a while. There are certainly many potential variants of technological solutions to the complex data management problems that organizations are facing today.
In R2, we tried to keep it simple and empower the IT administrators to get insight into the organization's data so that they can apply actions based on that insight:
· Define properties: The organization defines the properties that should be attached to files (taxonomy)
· Classify: Classification (manual and automatic) assigns properties to files (e.g., “PersonalInformation=Yes”)
· Apply policy based on classification: Data management is configured based on file properties (e.g., back up all files with “PersonalInformation=Yes” to an encrypted store)
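The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not FCI itself: `TAXONOMY`, `set_property`, the two policy functions, and the action names are all hypothetical. The point it tries to show is that each policy is written against the property only, so another application (an archival one, say) could be added as one more policy without touching the classification step.

```python
# Step 1 (taxonomy): the organization decides which properties and values exist.
TAXONOMY = {"PersonalInformation": {"Yes", "No"}}


def set_property(properties: dict, name: str, value: str) -> None:
    """Attach a property, rejecting anything the taxonomy does not define."""
    if value not in TAXONOMY.get(name, set()):
        raise ValueError(f"{name}={value} is not defined in the taxonomy")
    properties[name] = value


# Step 2 (classify): properties are assigned, manually or automatically.
files = {
    "public/customers.csv": {"server": "public", "properties": {}},
    "internal/salaries.xlsx": {"server": "internal", "properties": {}},
    "internal/menu.txt": {"server": "internal", "properties": {}},
}
set_property(files["public/customers.csv"]["properties"], "PersonalInformation", "Yes")
set_property(files["internal/salaries.xlsx"]["properties"], "PersonalInformation", "Yes")
set_property(files["internal/menu.txt"]["properties"], "PersonalInformation", "No")


# Step 3 (apply policy): each policy reads properties and proposes an action.
def has_personal_information(info: dict) -> bool:
    return info["properties"].get("PersonalInformation") == "Yes"


def public_server_policy(info: dict):
    """Personal information must not stay on a public server."""
    if info["server"] == "public" and has_personal_information(info):
        return "delete"
    return None


def backup_policy(info: dict):
    """Personal information is backed up to an encrypted store."""
    if has_personal_information(info):
        return "backup-encrypted"
    return "backup-standard"


# Policies are independent of each other and of whoever classified the file.
POLICIES = [public_server_policy, backup_policy]


def plan_actions(all_files: dict) -> dict:
    """Collect the actions every policy proposes for every file."""
    plan = {}
    for path, info in all_files.items():
        proposed = [policy(info) for policy in POLICIES]
        plan[path] = [action for action in proposed if action is not None]
    return plan


plan = plan_actions(files)
```

Here the policies are evaluated independently, so a file can receive an action from more than one of them; how conflicting actions are reconciled is left out of the sketch.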
These additional blog posts provide deep dives into how to leverage the File Classification Infrastructure (FCI) in your IT environment and how to develop solutions that plug into and enhance FCI:
· Classifying files based on location and content using the File Classification Infrastructure (FCI) in Windows Server 2008 R2
· Dealing with stale data on File Servers
· Customizing File Management Tasks
(This is a follow-up to the blog entry that presented the File Classification Infrastructure in Windows Server 2008 R2.)
Post by Nir Ben Zvi