PowerShell Mini-Scripting Games 2014: Answer 1

PowerShell Mini-Scripting Games 2014: Answer 1

  • Comments 6
  • Likes

Summary: Here is the answer to Problem 1 in Windows PowerShell Mini-Scripting Games 2014.

Microsoft Scripting Guy, Ed Wilson, is here. Here is an answer to Problem 1 in Windows PowerShell Mini-Scripting Games 2014. Last week, I posted the complete Problem 1 description and task outline.

Problem one synopsis

You are reading a CSV file that was created from another source. The CSV file has four fields that are to be read and then used to update four attributes in Active Directory. The problem is that the CSV file does not have all four fields filled out. In some cases, there is a space in the field, in others a tab, others are null, and others are empty.

When a field does not have a proper value, the attempt to update the attribute in Active Directory results in an error message. Your boss has decided to make the script your problem.

Solution

You need to write some logic that will identify all sorts of white space, including null fields. Detecting a null, a space, or a whatever seems to beg for a nice regular expression. It is also possible to do this manually by using a series of if statements.

Note  For a great reference, see Regular Expression Language - Quick Reference.

For example, if I want to match a space, I can simply supply a space, as shown here:

PS C:\> $a = " "

PS C:\> $a -match " "

True

Is $a null? I can also test that:

PS C:\> $a = $null

PS C:\> $a -match $null

True

I can incorporate this into logic, and use code that looks like the following:

PS C:\> If ($a -match " " -OR $a -match $null) {'$a is either a space or null'}

$a is either a space or null

Match a whitespace character

I can use a character class to try to match the empty field. Whitespace characters are things like a space, a tab, a carriage return, or a line feed. The regular expression character class for whitespace characters is backslash with a lowercase s: ( \s ). Windows PowerShell uses a backtick with a lowercase t for a Tab: ( `t ).

Note  Regular expression character classes are case sensitive. There is an excellent description of character classes on MSDN: Character Classes in Regular Expressions.  

In my first example, I use the Windows PowerShell `t character to store a Tab character in the variable $a. I then display it to the console, which is really nothing to see.

PS C:\> $a = "`t"

PS C:\> $a

Now, I attempt to match a space, which does not match a Tab. This is shown here:

PS C:\> $a -match " "

False

Now I use the \s whitespace character class from Regular Expressions, and I can see that it does match:

PS C:\> $a -match "\s"

True

Of course, it will also match the Windows PowerShell Tab character, `t:

PS C:\> $a -match "`t"

True

But the whitespace class does not match a null, as shown here:

PS C:\> $a = $null

PS C:\> $a -match "\s"

False

But it will match one or more spaces. This is shown here:

PS C:\> $a = " "

PS C:\> $a -match "\s"

True

PS C:\> $a = "  "

PS C:\> $a -match "\s"

True

How about not matching

Rather than trying to match every possible blank or empty sort of thing, how about if I simply don’t match any letter or number? I can easily do this by specifying a couple of character groups. All lower case letters, all upper case letters, and all numbers will pretty much do it. To specify a character group, I use square brackets for all lower case letters [a-z] or for all upper case letters [A-Z]. The asterisk ( * ) means zero or more instances of the characters.

I create my two character groups and store them in a variable I call pattern:

$pattern = "[a-z][A-Z]*"

Now I check to see if it does not match my various test conditions. First I check to see if It does not match a blank space:

PS C:\> $pattern = "[a-z][A-Z]*"

PS C:\> $a = " "

PS C:\> $a -notmatch $pattern

True

Then I check to see if it does not match a null:

PS C:\> $a = $null

PS C:\> $a -notmatch $pattern

True

Now I check to see if it does not match a Tab and a carriage return line feed:

PS C:\> $a = "`t"

PS C:\> $a -notmatch $pattern

True

PS C:\> $a = "`r`n"

PS C:\> $a -notmatch $pattern

True

Application to the problem

So, if I create a simple CSV file that has a few empty fields, it might look like the following:

"f1","f2","f3","f4"

"a","b","c","d"

"","b","c","d"

"a"," ","c","d"

Now, I write a script that reads this CSV file, and contains the pattern I used earlier. To walk through the CSV file, I use the GetEnumerator() method. I then use While and the MoveNext() method to proceed from each record to the next one. I then use an IF statement to see if I find a match. Here is the script:

PROBLEM1SOLUTION.PS1

$data = Import-Csv -Path C:\DataIn\q1.csv

$pattern = "[a-z][A-Z]*"

     $e = $data.GetEnumerator()

     While ($e.MoveNext())

      {

       If ($e.current.f1 -notmatch $pattern) {"$($e.current)  does not match"}

       If ($e.current.f2 -notmatch $pattern) {"$($e.current)  does not match"}

       If ($e.current.f3 -notmatch $pattern) {"$($e.current)  does not match"}

       If ($e.current.f4 -notmatch $pattern) {"$($e.current)  does not match"}

       } 

That is one approach to the problem. Mini-Scripting Games Answer Week will continue tomorrow when I will have a solution for Problem 2.

I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy 

Your comment has been posted.   Close
Thank you, your comment requires moderation so it may take a while to appear.   Close
Leave a Comment
  • This is what I came up with. $process will contain only rows that don't contain null or whitespace values:

    $csv = Import-CSV "C:\Test\Test.csv"

    $process = @()

    $process = foreach($item in $csv) {
    $badValue = $false
    $item.PSObject.Properties | foreach{
    if ([string]::IsNullOrWhiteSpace($_.Value)){
    $badValue = $true
    }

    }
    if($badValue -eq $false){$item}
    }
    $process

    The test CSV provided would result in:

    f1 f2 f3 f4
    -- -- -- --
    a b c d

  • This is what I did.

    $csvData = Import-Csv C:\Users\kbhunut\Desktop\ps.csv
    $Headers = ($csvData | gm -MemberType NoteProperty).Name
    $rowCnt = 2
    foreach($row in $csvData){

    foreach($curCol in $Headers){
    if(!$row.$curCol -or $row.$curCol -eq " " -or $row.$curCol -eq "'t"){

    Write-Output "Row number $rowCnt and Column $curCol is Empty"
    }
    }
    $rowCnt++
    }


    My Coworker came up with this. By using Trim Method

    $path = "C:\users\kbhunut\Desktop\ps.csv"

    $c=2
    Import-Csv $path | Foreach-Object {

    foreach ($property in $_.PSObject.Properties){
    if(!$property.Value.Trim()){
    Write-Output "Row $c : Missing $($property.Name)"
    }
    }
    $c++
    }

  • I thought of using a regex too, however I took the reverse approach, i.e. looking explicitely for spaces and tabs:

    if ($a -match "^[\s\t]{0,}$") {'$a is not valid'}

    BTW if a field contains characters with accents (like é or à) or any other special chatacter they will fail the test with [a-z][A-Z], but maybe it was intended:

    $a = "é"
    $pattern = "[a-z][A-Z]*"
    $a -match $pattern
    False

    Thanks for these games, they were a lot of fun.

  • I took the task to be one of extracting the rows that had all four values populated, and I wrote this one-liner:

    Import-CSV "c:\test\test.csv' -header "a","b","c","d" | where-object {$_.a -match '\S' -and $_.b -match '\S' -and $_.c -match '\S' -and $_.d -match '\S'}