Expert Commentary: 2012 Scripting Games Advanced Event 6

Summary: Microsoft senior software engineer on the Windows PowerShell team, Lee Holmes, provides expert commentary for 2012 Scripting Games Advanced Event 6.

Microsoft Scripting Guy, Ed Wilson, is here. Lee Holmes is the expert commentator for Advanced Event 6.


Lee is a senior software engineer on the Microsoft Windows PowerShell team, and he has been an authoritative source of information about Windows PowerShell since its earliest betas. He is the author of the Windows PowerShell Cookbook, Windows PowerShell Pocket Reference, and the Windows PowerShell Quick Reference. 
 
Blog: Precision Computing 
Twitter: http://www.twitter.com/Lee_Holmes 
LinkedIn: http://www.linkedin.com/pub/lee-holmes/1/709/383

The script for the 2012 Scripting Games Advanced Event 6 is pretty descriptive, so rather than go over it again line-by-line, I thought it’d be helpful to talk about two of the main ideas that went into creating the script: jobs and streaming.

Networking is slow, parallel jobs are not

When running a large network-bound operation (such as retrieving the Win32_OperatingSystem class from a long list of computers), your computer spends the vast majority of its time waiting on the network: waiting for the connection, waiting for the computer to respond, and waiting for the data to get back.

To address this problem, Windows PowerShell 2.0 introduced the concept of "jobs," primarily in remoting, WMI, and eventing. When you assign a multimachine task to a job, Windows PowerShell distributes your commands among many worker threads, which it then runs in parallel. Each worker thread processes one computer at a time. As each child job completes (for example, a remote WMI query against a specific computer), Windows PowerShell feeds that worker thread a command for another computer. By default, Windows PowerShell launches 32 child jobs in parallel.
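As a minimal sketch of the job pattern (the computer names here are placeholders, not from the event scenario):

```powershell
## Fan the WMI query out as a background job; Get-WmiObject runs up to
## 32 child jobs in parallel by default (adjustable with -ThrottleLimit).
$computers = "SERVER1","SERVER2","SERVER3"
$job = Get-WmiObject Win32_OperatingSystem -ComputerName $computers -AsJob

## Block until the job finishes, then collect everything it produced.
Wait-Job $job | Out-Null
Receive-Job $job | Select-Object CSName, LastBootUpTime
```

Collecting everything at the end like this is the simple case; as discussed later, a long-running script is better served by receiving results as they arrive.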

What’s most amazing about Windows PowerShell jobs running 32 tasks in parallel is not that it makes things 32 times faster. It’s that it makes them even faster than that! The reason is a branch of computer science called “queueing theory.” Here’s a summary for the busy admin…

As a thought experiment, consider running a query against 92 computers one-by-one. Also, imagine that your first and second computers are rebooting, and they take a minute before failing to return their response. The other 90 return their responses in two seconds each.

The total time for that query is five minutes, for an average of about 3.3 seconds per successful computer. This is the same problem that happens when a bunch of hungry shoppers get stuck behind the crazy person paying with 100 coupons at the grocery store: every delay impacts everybody in line.

Now, consider the impact of parallel jobs in Windows PowerShell. Those first two rebooting computers tie up two of our available worker threads for an entire minute, but we still have 30 more to process the remaining 90 computers. Those 90 computers are all finished in about six seconds, giving an average time of about 0.07 seconds per successful computer. That's nearly 50 times faster than processing one computer at a time! Surprising, but incredibly cool. In the grocery store, this is like waiting in a single line, but having 32 cashiers serving it.
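The arithmetic behind the thought experiment can be checked in a few lines of PowerShell (the numbers are the ones from the scenario above):

```powershell
## Sequential: 2 stalled computers at 60 s each, plus 90 responsive
## computers at 2 s each.
$sequentialSeconds = (2 * 60) + (90 * 2)   # 300 s, or 5 minutes
$sequentialSeconds / 90                    # ~3.3 s per successful computer

## Parallel with 32 workers: the 2 stalled computers occupy 2 workers,
## leaving 30 to clear 90 computers in 90/30 = 3 rounds of 2 s each.
$parallelSeconds = 3 * 2                   # 6 s
$parallelSeconds / 90                      # ~0.07 s per successful computer
```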

Streaming is important

Given that this is likely to be a long-running script, you’re going to want to keep your results as dynamic as possible.

As the first step toward that, this script makes good use of the verbose and progress streams. As the script processes computers, it emits a progress message telling you which one it's working on. If you specify the -Verbose parameter, you get even more detail—specifically, the uptime information as it writes it to the CSV.
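As a sketch, the pattern looks something like this (the loop body is illustrative, not taken verbatim from the event script):

```powershell
foreach($computer in $computers)
{
    ## Progress goes to the progress stream: visible, but never mixed
    ## into the pipeline output.
    Write-Progress -Activity "Querying uptime" -Status $computer

    $os = Get-WmiObject Win32_OperatingSystem -ComputerName $computer

    ## Detail goes to the verbose stream: silent unless -Verbose is given.
    Write-Verbose "Retrieved uptime information for $computer"

    ## Real results go to the output stream, where other commands
    ## (such as Export-Csv) can consume them.
    $os
}
```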

This approach solves a common problem that I see: scripts that force their verbose or debugging information on the end user. The ultimate example of this sin is aggressive use of the Write-Host cmdlet. When you run this kind of script, you get reams and reams of text on screen, with no way to silence it. You can’t tell good information from bad, and other scripts can’t use it without this internal debugging information spewing all over the screen.

In addition to streaming progress, the script also streams its output. Although it would be easiest to collect all of the job output into a variable and then dump it to the CSV, all of your data is lost if you ever cancel the script while it is executing. When your script streams its output, you can easily monitor the results as they are received with this simple command:

Get-Content 20120409_Uptime.csv -Wait

If your query takes an hour to complete, it’s nice to be able to check its progress before then.

Jobs and streaming—two useful techniques to maximize the efficiency of long-running tasks. Now, here is my solution for Advanced Event 6:

##############################################################################
##
## Get-DistributedUptime
##
##############################################################################

<#

.SYNOPSIS

Retrieves the uptime information (as of 8:00 AM local time) for the list of
computers defined in the $computers variable. Output is stored in a
date-stamped CSV file in the "My Documents" folder, with a name ending in
"_Uptime.csv".

.EXAMPLE

Get-DistributedUptime

#>

param(
    ## Overwrites the output file, if it exists
    [Parameter()]
    [Switch] $Force
)

## Set up common configuration options and constants
$reportStart = Get-Date -Hour 8 -Minute 0 -Second 0
$outputPath = Join-Path ([Environment]::GetFolderPath("MyDocuments")) `
    ("{0:yyyyMMdd}_Uptime.csv" -f $reportStart)

## See if the file exists. If it does (and the user has not specified -Force),
## then exit because the script has already been run today.
if(Test-Path $outputPath)
{
    if(-not $Force)
    {
        Write-Verbose "$outputPath already exists. Exiting"
        return
    }
    else
    {
        Remove-Item $outputPath
    }
}

## Get the list of computers. If desired, this list could be read from
## a text file as well:
## $computers = Get-Content computers.txt
$computers = "EDLT1","EDLT2","EDLT3","EDLT4"

## Start the job to process all of the computers. This makes 32
## connections at a time, by default.
$j = Get-WmiObject Win32_OperatingSystem -ComputerName $computers -AsJob

## While the job is running, process its output
do
{
    ## Wait for some output, then retrieve the new output
    $output = @(Wait-Job $j | Receive-Job)

    foreach($result in $output)
    {
        ## We got a result, start processing it
        Write-Progress -Activity "Processing" -Status $result.PSComputerName

        ## Convert the DMTF date to a .NET Date
        $lastbootupTime = $result.ConvertToDateTime($result.LastBootUpTime)

        ## Subtract the time the report run started. If the system
        ## booted after the report started, ignore that for today.
        $uptimeUntilReportStart = $reportStart - $lastbootupTime
        if($uptimeUntilReportStart -lt 0)
        {
            $uptimeUntilReportStart = New-TimeSpan
        }

        ## Generate the output object that we're about to put
        ## into the CSV. Add a call to Select-Object at the end
        ## so that we can ensure the order.
        $outputObject = New-Object PSObject -Property @{
            ComputerName = $result.PSComputerName;
            Days = $uptimeUntilReportStart.Days;
            Hours = $uptimeUntilReportStart.Hours;
            Minutes = $uptimeUntilReportStart.Minutes;
            Seconds = $uptimeUntilReportStart.Seconds;
            Date = "{0:M/dd/yyyy}" -f $reportStart
        } | Select-Object ComputerName, Days, Hours, Minutes, Seconds, Date

        Write-Verbose $outputObject

        ## Append it to the CSV. If the CSV doesn't exist, create it and
        ## PowerShell will create the header as well.
        if(-not (Test-Path $outputPath))
        {
            $outputObject | Export-Csv $outputPath -NoTypeInformation
        }
        else
        {
            ## Otherwise, just append the data to the file. Lines
            ## zero and one that we are skipping are the type
            ## information and the header.
            ($outputObject | ConvertTo-Csv)[2] >> $outputPath
        }
    }
} while($output)

~Lee

2012 Scripting Games Guest Commentator Week Part 2 will continue tomorrow when we will present the scenario for Event 7.

I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy

 

  • Hello Lee,

    it looks like a very good idea to start the WMI query as job and I can follow your thoughts.

    But I couldn't get the script to work :-(

    It looks like    

    $output = @(Wait-Job $j | Receive-Job)

    is empty if I use a list of computers!

    If I use only my local computer alone, it seems to work, but other errors appear in the end, like

    Write-Progress : Cannot bind argument to parameter "Status" because it is null.

    I really would like to use jobs and code checking the results like you did.

    Perhaps you've got an idea why this won't work, either at home or at work?

    Klaus

  • I am seeing similar results as @K_Schulte. With more than 1 machine, $output is empty. The following also does not return anything

    $j | Receive-Job

    whereas Receive-Job -job $j works fine.

    If the ChildJobs.Count is more than 1, this behavior is observed. If only 1 machine is specified ( ChildJobs.Count is 1 ), output is displayed properly.

  • I believe Get-WmiObject does not use the remoting infrastructure, hence the PSComputerName property will not be available in the result, we can use the __SERVER property instead. Otherwise, as @K_Schulte mentioned, we get an error in Write-Progress, and the excel file created does not contain the computer name.

  • This makes me feel good about using Jobs in my entry but I can see I have a bit to learn. This is a good one.

  • The help for Get-WmiObject notes the following for the -AsJob parameter:

    Note: To use this parameter with remote computers, the local and remote computers must be configured for remoting. Additionally,

    you must start Windows PowerShell by using the "Run as administrator" option in Windows Vista and later versions of Windows. For

    more information, see about_Remote_Requirements.

    Srisas: In PowerShell 3.0, the __Server property will have a PSComputerName alias for consistency.

  • Thanks, Jason. Yes, Powershell 3.0 does have this alias, was trying out the solution on Powershell 2.0 ( which is what I used for the Scripting Games ) and this alias is not present in 2.0.

  • I didn't use background jobs, after reading, as Jason did, that using -AsJob requires remoting on both the local and target systems. I'd read Ed's hints that he'd posted the day before and noted that the scenario did not specify that I could expect all the servers to be 2008 R2 with remoting enabled. That caused me to think that using jobs might produce a failing grade if the solution was tested against W2K servers.

  • @Jason and @mjolinor - It requires only WMI to be enabled. Not PowerShell Remoting. Can you file a doc bug on connect.microsoft.com/PowerShell

  • On Powershell V3, $output = @(Wait-Job $j | Receive-Job) also worked fine. The script as provided by Lee works fine with Powershell V3, but gives the errors mentioned in the comment by @K_Schulte in Powershell V2.

  • @K_Schulte @Srisas To make it work in V2, do the following changes:

    Line #63: $output = @(Wait-Job $j | ForEach-Object {Receive-Job $_})

    Explanation: In PowerShell 3.0 we got a new feature called "Property Unrolling", which made this line work in V3 without using a foreach:

    www.nivot.org/.../PowerShell-30%E2%80%93Now-with-Property-Unrolling!.aspx

    Line #71: Write-Progress -Activity "Processing" -Status $result.__server

    Explanation: Like Jason Hofferle said; In PowerShell 3.0, the __Server property will have a PSComputerName alias for consistency.

  • @K_Schulte @Srisas I made some customizations to the script, so the line numbers I mentioned above is incorrect. The correct numbers from the original script is 61 and 66.

    I also forgot line 83:

    ComputerName = $result.__server;

    One additional tip: You may want to use Unicode encoding on Export-Csv:

    $outputObject | Export-Csv $outputPath -NoTypeInformation -Encoding Unicode