Using Windows PowerShell Scripts for File Classification

by Anupadmaja Raghavan

The File Classification Infrastructure in Windows Server 2008 R2 is great for classifying files based on file content and/or location. You can define rules that discover sensitive files (such as those containing Social Security numbers) based on the presence of certain text patterns in your files. See http://blogs.technet.com/filecab/archive/2009/05/11/classifying-files-based-on-location-and-content-using-the-file-classification-infrastructure-fci-in-windows-server-2008-r2.aspx.

We also heard from IT administrators that they would like to classify files by other means, such as file type, custom parsing techniques for certain file formats, or information from Active Directory (AD). FCI (the File Classification Infrastructure) provides an extensible model that you can use to define your own classifiers. Today you can write classifiers in languages such as C++ and C#, as illustrated in the SDK (http://go.microsoft.com/fwlink/?LinkID=150217&clcid=0x409 http://www.microsoft.com/downloads/details.aspx?FamilyID=6db1f17f-5f1e-4e54-a331-c32285cdde0c&displaylang=en).

At this point you might wonder: I am an IT administrator. Is it possible to write classification rules without writing a complicated C# or C++ DLL, for example with a simple PowerShell script? The answer is yes.

The FCI team has developed a custom classifier module, the PowerShell classifier module, which is capable of using PowerShell scripts to perform custom classification of files.

The PowerShell classifier module consists of two components:

  1. the PowerShell host classifier and
  2. a PowerShell script.

The PowerShell host classifier has been developed by the FCI team for the convenience of administrators. It takes care of plugging into FCI and PowerShell. The administrator only has to provide the PowerShell script that does the custom classification.

How to use the PowerShell Classifier Module

Here is how it works. The PowerShell host classifier of the PowerShell classifier module is registered with FCI as a classifier module. A classification rule is then set up in FCI that defines the parameters to send to the PowerShell host classifier when classification runs. During classification, for every file that it scans, FCI passes the file, along with the parameters from the rule, to the PowerShell host classifier. One of the mandatory parameters that the PowerShell host classifier takes is the name of the PowerShell script file to use. For each file it receives from FCI, the PowerShell host classifier calls the PowerShell script, which determines the classification for that file, and sends the result back to FCI. Finally, FCI reads the classification result and proceeds with the other actions needed to complete the classification of that file.
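The flow described above can be pictured with a small PowerShell simulation. This is only a sketch of the control flow, not the actual host classifier: the $Rule hashtable, the in-memory script block standing in for the script file, and the sample file records are all hypothetical, and the real host passes an IFsrmPropertyBag to the script rather than a custom object.

```powershell
# Hypothetical stand-in for a rule defined in FCI: the property to set and
# the additional classification parameter that names the script file.
$Rule = @{
    PropertyName   = 'IsSpecial'
    PropertyValue  = $true
    ScriptFileName = 'C:\PCM\ClassificationScript.ps1'
}

# Stand-in for the contents of the script file. A real host would run the
# file named by $Rule.ScriptFileName instead of this script block.
$Script = {
    Process { $_.Name.IndexOf('Presentation') -ge 0 }
}

# Stand-ins for the files FCI scans (only the two members used here).
$Files = @(
    [pscustomobject]@{ Name = 'Q3 Presentation.pptx'; RelativePath = '\Share\Sales' }
    [pscustomobject]@{ Name = 'notes.txt';            RelativePath = '\Share\Misc' }
)

# For each file: run the script on it, then report the result to the caller.
$Results = foreach ($File in $Files) {
    $Answer = $File | & $Script
    [pscustomobject]@{
        File     = $File.Name
        Property = $Rule.PropertyName
        Applied  = [bool]$Answer
    }
}
$Results
```

The point of the sketch is the division of labor: the rule carries the property and the script path, the host runs the script once per file, and the script only has to answer for the single file it is given.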

Once the PowerShell host classifier of the PowerShell classifier module is installed and registered, the administrator has two main actions to perform:

  1. Configure a classification rule to use the PowerShell classifier module:
    1. Create a classification rule R1 with standard rule settings. See http://blogs.technet.com/filecab/archive/2009/05/11/classifying-files-based-on-location-and-content-using-the-file-classification-infrastructure-fci-in-windows-server-2008-r2.aspx for samples.
    2. In the Classification tab, choose “PowerShell Host Classifier” as the classifier module.
    3. Choose the property you would like to set, say P1, and specify the values it can take.
    4. Click the “Advanced…” button and, in the “Additional Classification Parameters” tab, specify the mandatory parameter with the name “ScriptFileName” and, as its value, the full path to the PowerShell script to use for the custom classification, say “C:\PCM\ClassificationScript.ps1”.
      This tells the PowerShell host classifier that “C:\PCM\ClassificationScript.ps1” is the script file to execute for each file scanned within the scope of this rule.
    5. Click OK and complete the rule creation.
  2. Create the PowerShell script for custom classification:
    1. The PowerShell host classifier receives the name of the script file from FCI and invokes the script for each file that is sent to it for classification.
    2. The PowerShell script must contain a Process block. In the Process block, the pipeline input ($_) is an IFsrmPropertyBag, which the samples below assign to $PropertyBag. Other inputs available to the script file are documented in the ReadMe file in the SDK.
    3. The script does its work using values from the IFsrmPropertyBag and then sends back results. Note: the name of the file being processed can be read from IFsrmPropertyBag.Name and its path from IFsrmPropertyBag.RelativePath, as the samples below do.
    4. The script operates in one of two modes:
      • Yes-No mode: In this mode, the script does its work and sends back a result indicating Yes or No. Yes means the property value in the rule should be applied to the file. No means the property value in the rule should not be applied to the file. Say we need to classify any file containing the term “Presentation” in its path or name with the property “IsSpecial” set to true. We create a rule as above with the property “IsSpecial” and the value true, choose the PowerShell host classifier, and point it to a script file that looks as follows:

Process
{
    ################################
    ### Get the file name and path
    ################################
    $PropertyBag = $_
    $FileName = $PropertyBag.Name
    $FilePath = $PropertyBag.RelativePath
    $SpecialString = 'Presentation'
    ### IndexOf returns -1 when the string is not found
    If (($FileName.IndexOf($SpecialString) -lt 0) -and
        ($FilePath.IndexOf($SpecialString) -lt 0))
    {
        ### Don't set property "IsSpecial" on the file
        $false
    }
    Else
    {
        ### Set property "IsSpecial" on the file
        $true
    }
}
      • Classifier supplied value mode: In this mode, the script does its work and sends back the value of the property to be set on the file. It sends back nothing if the property value does not need to be updated. Say we need to classify any file containing the term “Confidential” in its path or name with the property “Type” set to “Confidential”. We create a rule as above with the String property “Type”, choose the PowerShell host classifier, and point it to a script file that looks as follows:

Process
{
    ################################
    ### Get the file name and path
    ################################
    $PropertyBag = $_
    $FileName = $PropertyBag.Name
    $FilePath = $PropertyBag.RelativePath
    $SpecialString = 'Confidential'
    ### IndexOf returns -1 when the string is not found
    If (($FileName.IndexOf($SpecialString) -lt 0) -and
        ($FilePath.IndexOf($SpecialString) -lt 0))
    {
        ### No property update is sent back
    }
    Else
    {
        ### Send back the property value to be set on the file
        $SpecialString
    }
}
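
Before wiring a script into a rule, it can help to exercise its matching logic from an ordinary PowerShell prompt. The sketch below is one assumed way to do that and is not taken from the SDK: it pipes hypothetical stand-in objects (exposing only Name and RelativePath) through a script block that has the same Process-block shape as the “Confidential” script. New-MockPropertyBag is a made-up helper. A real run receives an IFsrmPropertyBag from the host classifier, so this checks only the string matching.

```powershell
# Same matching logic as the "Confidential" script, held in a script block
# so it can be run without the host classifier.
$Classifier = {
    Process {
        $SpecialString = 'Confidential'
        if (($_.Name.IndexOf($SpecialString) -ge 0) -or
            ($_.RelativePath.IndexOf($SpecialString) -ge 0)) {
            $SpecialString   # value to set on the file
        }
        # No output means "leave the property unchanged".
    }
}

# Hypothetical helper: builds a stand-in for the property bag.
function New-MockPropertyBag([string]$Name, [string]$RelativePath) {
    [pscustomobject]@{ Name = $Name; RelativePath = $RelativePath }
}

$InName = New-MockPropertyBag 'Confidential plan.docx' '\Share\Docs' | & $Classifier
$InPath = New-MockPropertyBag 'plan.docx' '\Share\Confidential'      | & $Classifier
$NoHit  = New-MockPropertyBag 'plan.docx' '\Share\Docs'              | & $Classifier
$Lower  = New-MockPropertyBag 'confidential.docx' '\Share\Docs'      | & $Classifier

"In name: $InName"
"In path: $InPath"
"No hit returned nothing: $($null -eq $NoHit)"
"Lowercase returned nothing: $($null -eq $Lower)"
```

The last case shows that IndexOf is case-sensitive, so “confidential.docx” is not matched. If that is not what you want, one option is the IndexOf overload that takes a second argument such as [System.StringComparison]::OrdinalIgnoreCase.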

The PowerShell host classifier module provides a simple way to do custom classification using PowerShell scripts. More details on the topics discussed in this post and other capabilities of the PowerShell classifier module are available in the SDK. Check back with us for a post on “How to do Content Classification of Files using Windows PowerShell scripts”.

Comments
  • > You can write today classifiers in languages such as C++ and C# as illustrated in the SDK

    I found sample code in C# which uses the FSRM API to classify files, but there seems to be no sample code for C++. I am facing very basic difficulties, such as which files to include, which libs to link with, and which DLL actually implements the interface. The MSDN documentation of IFsrmClassificationManager (http://msdn.microsoft.com/en-us/library/dd392349(VS.85).aspx) does not talk about it.

    Would you please talk more on these aspects?

    Thanks,

    -Nilesh.