Weekend Scripter: Where.exe—The What, Why, and How


Microsoft Scripting Guy Ed Wilson here. The sounds outside seem amplified by the inky darkness that grudgingly gives way to the early morning sun. I am sitting on the front porch sipping a cup of English Breakfast Tea, munching on a freshly baked cinnamon scone, and hanging out on Twitter via my laptop. Having wireless access throughout our house makes possible an Internet life without boundaries. Of course, in my office, I have a gigabit switched Ethernet network, but for general roaming around The House That Scripting Built, the 54 Mbps I get from my Wireless G (802.11g) broadband router is sufficient.

As I sipped my morning cup of tea, I reflected on the “work week” that recently passed. Another week on the calendar seems to slip into the sands of time. One hundred and twenty hours are gone—and what was accomplished? Well, let’s see. I spent nearly eight hours in meetings, and at least as much time answering email. I wrote a series of articles on using Windows PowerShell and the Active Directory cmdlets that were pretty cool. I also spent some time on Facebook, and hanging out on Twitter.

Speaking of Twitter, I had an intriguing conversation with a person who is a system administrator and Windows PowerShell scripter from Rotterdam who was talking about the performance of the Get-ChildItem cmdlet when compared to using Where.exe. Hmm … I said.

The first thing to realize is that inside Windows PowerShell, you must use Where.exe if you intend to call the “where command.” The reason is that where is an alias for the Where-Object cmdlet. If you use where without supplying the .exe extension, an error occurs, as shown in the following image.

Image of error shown when where is used without .exe extension
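
The name collision can also be checked from the console. The following is a minimal sketch, not part of the original post; it assumes where.exe is on the path (as it is on a typical Windows installation), and the variable names are only illustrative.

```powershell
# 'where' is an alias, so Windows PowerShell resolves it before looking for where.exe.
$alias = Get-Alias -Name where
$alias.Definition    # the cmdlet that the alias points to

# Asking only for applications skips the alias and looks for the executable.
# On a machine without where.exe (for example, a non-Windows system) this finds nothing.
$exe = Get-Command -Name where.exe -CommandType Application -ErrorAction SilentlyContinue
if ($exe) {
    $exe | Select-Object -First 1 | Format-List Name, CommandType, Definition
}
```

Typing the full file name, where.exe, is what tells Windows PowerShell to bypass the alias.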

When I add the .exe extension to the where command, I am rewarded with an output that displays all of the pictures for my July Hey, Scripting Guy! Blog posts (the file names begin with HSG-7). The /R means to recurse. The command is shown here:

 

PS C:\> where.exe /R c:\data HSG-7*.jpg
c:\data\ScriptingGuys\2010\HSG_7_5_10\HSG-7-6-10-01.jpg
c:\data\ScriptingGuys\2010\HSG_7_5_10\HSG-7-6-10-02.jpg
c:\data\ScriptingGuys\2010\HSG_7_5_10\HSG-7-6-10-03.jpg
c:\data\ScriptingGuys\2010\HSG_7_5_10\HSG-7-6-10-04.jpg
c:\data\ScriptingGuys\2010\HSG_7_5_10\HSG-7-6-10-05.jpg
c:\data\ScriptingGuys\2010\HSG_7_5_10\HSG-7-6-10-06.jpg
c:\data\ScriptingGuys\2010\HSG_7_5_10\HSG-7-6-10-07.jpg
c:\data\ScriptingGuys\2010\HSG_7_5_10\HSG-7-6-10-08.jpg
c:\data\ScriptingGuys\2010\HSG_7_5_10\HSG-7-6-10-09.jpg
c:\data\ScriptingGuys\2010\HSG_7_5_10\HSG-7-7-10-01.jpg
c:\data\ScriptingGuys\2010\HSG_7_5_10\HSG-7-8-10-01_OldSeanPic.jpg
PS C:\>

To perform the same search using the Get-ChildItem cmdlet, you specify the starting folder with the Path parameter, the file name pattern with the Include parameter, and add the Recurse switch. This command and the results it produces are shown here:

 

PS C:\> Get-ChildItem -Path c:\data -Include HSG-7*.jpg -Recurse
    Directory: C:\data\ScriptingGuys\2010\HSG_7_5_10

Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---         6/30/2010   1:32 PM       5495 HSG-7-6-10-01.jpg
-a---         6/30/2010   1:34 PM      20035 HSG-7-6-10-02.jpg
-a---         6/30/2010   1:35 PM      53348 HSG-7-6-10-03.jpg
-a---         6/30/2010   1:36 PM      39764 HSG-7-6-10-04.jpg
-a---         6/30/2010   1:37 PM      39875 HSG-7-6-10-05.jpg
-a---         6/30/2010   1:37 PM      13774 HSG-7-6-10-06.jpg
-a---         6/30/2010   1:39 PM      20299 HSG-7-6-10-07.jpg
-a---         6/30/2010   1:40 PM      17651 HSG-7-6-10-08.jpg
-a---         6/30/2010   1:40 PM      39922 HSG-7-6-10-09.jpg
-a---         6/30/2010   5:45 PM       4104 HSG-7-7-10-01.jpg
-a---         4/15/2010   9:04 PM      23410 HSG-7-8-10-01_OldSeanPic.jpg
PS C:\>
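
Get-ChildItem also has a Filter parameter. With Include, Windows PowerShell retrieves the items and then matches the names itself; with Filter, the pattern is handed to the file system provider, which is often quicker. The following sketch is not from the original post. It builds a small throwaway folder (the folder and file names are made up) so that it does not depend on c:\data, and it only shows that both parameters select the same files here.

```powershell
# Build a small throwaway folder so the example does not depend on c:\data.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("hsg-demo-" + [guid]::NewGuid())
$sub  = Join-Path $root 'HSG_7_5_10'
New-Item -ItemType Directory -Path $sub -Force | Out-Null
'HSG-7-6-10-01.jpg', 'HSG-7-7-10-01.jpg', 'HSG-6-30-10-01.jpg', 'notes.txt' |
    ForEach-Object { Set-Content -Path (Join-Path $sub $_) -Value 'x' }

# -Include: PowerShell retrieves the items and then matches the names itself.
$byInclude = @(Get-ChildItem -Path $root -Include 'HSG-7*.jpg' -Recurse)

# -Filter: the pattern is passed to the FileSystem provider.
$byFilter = @(Get-ChildItem -Path $root -Filter 'HSG-7*.jpg' -Recurse)

"Include found {0} files; Filter found {1} files" -f $byInclude.Count, $byFilter.Count

# Clean up the throwaway folder.
Remove-Item -Path $root -Recurse -Force
```

Both forms return System.IO.FileInfo objects, so the choice between them is mostly about speed and about how rich a pattern is needed.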

When I ran the two commands, I noticed that the Get-ChildItem command seemed to take longer to complete. I therefore decided to pull out the Measure-Command cmdlet to see how long each command takes to complete. When using Where.exe, the command takes about 1.05 seconds. This is shown here:

 

PS C:\> Measure-Command {where.exe /R c:\data HSG-7*.jpg}
Days : 0
Hours : 0
Minutes : 0
Seconds : 1
Milliseconds : 46
Ticks : 10466861
TotalDays : 1.21144224537037E-05
TotalHours : 0.000290746138888889
TotalMinutes : 0.0174447683333333
TotalSeconds : 1.0466861
TotalMilliseconds : 1046.6861
PS C:\>

When I use the Measure-Command cmdlet to test the performance of the Get-ChildItem cmdlet, it tells me that the command takes 5.9 seconds to run. This is shown here:

 

PS C:\> Measure-Command {Get-ChildItem -Path c:\data -Include HSG-7*.jpg -Recurse}
Days : 0
Hours : 0
Minutes : 0
Seconds : 5
Milliseconds : 923
Ticks : 59235892
TotalDays : 6.85600601851852E-05
TotalHours : 0.00164544144444444
TotalMinutes : 0.0987264866666667
TotalSeconds : 5.9235892
TotalMilliseconds : 5923.5892
PS C:\>
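
A single Measure-Command run can be skewed by file system caching, because whichever command runs first may pay for the cold disk reads. One way to reduce that effect is to repeat each command and average the results. The following is only a sketch under stated assumptions: Measure-Average is a hypothetical helper (it is not a built-in cmdlet and not part of the original post), and the throwaway folder stands in for c:\data.

```powershell
# Hypothetical helper: run a script block several times and average the elapsed milliseconds.
function Measure-Average {
    param(
        [Parameter(Mandatory = $true)] [scriptblock] $ScriptBlock,
        [int] $Count = 5
    )
    $times = 1..$Count | ForEach-Object {
        (Measure-Command -Expression $ScriptBlock).TotalMilliseconds
    }
    ($times | Measure-Object -Average).Average
}

# Throwaway folder with a few files, so the sketch does not depend on c:\data.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("hsg-timing-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $root -Force | Out-Null
1..20 | ForEach-Object { Set-Content -Path (Join-Path $root "HSG-7-$_.jpg") -Value 'x' }

$gciMs = Measure-Average { Get-ChildItem -Path $root -Include HSG-7*.jpg -Recurse }

# where.exe exists only on Windows, so it is timed only when it can be found.
$whereMs = $null
if (Get-Command -Name where.exe -CommandType Application -ErrorAction SilentlyContinue) {
    $whereMs = Measure-Average { where.exe /R $root HSG-7*.jpg }
}

"Get-ChildItem average: {0:N1} ms" -f $gciMs
if ($whereMs) { "where.exe average: {0:N1} ms" -f $whereMs }

# Clean up the throwaway folder.
Remove-Item -Path $root -Recurse -Force
```

The numbers will differ from machine to machine; the point of the helper is only to make the comparison less dependent on which command happens to run first.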

One of the big things about Windows PowerShell cmdlets is that they return objects, whereas the Where.exe command returns strings. Because of this, it might seem that Where.exe is not the best tool to use with Windows PowerShell. However, the Path parameter of the Get-Item cmdlet accepts a string, so I can pipe the results of Where.exe to the ForEach-Object cmdlet, and inside the script block I can use Get-Item to return an instance of the System.IO.FileInfo .NET Framework class. This command is shown here:

 

PS C:\> where.exe /R c:\data HSG-7*.jpg | ForEach-Object { Get-Item -Path $_ }
    Directory: C:\data\ScriptingGuys\2010\HSG_7_5_10

Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---         6/30/2010   1:32 PM       5495 HSG-7-6-10-01.jpg
-a---         6/30/2010   1:34 PM      20035 HSG-7-6-10-02.jpg
-a---         6/30/2010   1:35 PM      53348 HSG-7-6-10-03.jpg
-a---         6/30/2010   1:36 PM      39764 HSG-7-6-10-04.jpg
-a---         6/30/2010   1:37 PM      39875 HSG-7-6-10-05.jpg
-a---         6/30/2010   1:37 PM      13774 HSG-7-6-10-06.jpg
-a---         6/30/2010   1:39 PM      20299 HSG-7-6-10-07.jpg
-a---         6/30/2010   1:40 PM      17651 HSG-7-6-10-08.jpg
-a---         6/30/2010   1:40 PM      39922 HSG-7-6-10-09.jpg
-a---         6/30/2010   5:45 PM       4104 HSG-7-7-10-01.jpg
-a---         4/15/2010   9:04 PM      23410 HSG-7-8-10-01_OldSeanPic.jpg
PS C:\>
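
After the strings have been converted to FileInfo objects, the usual object-based tools apply. The following sketch is not from the original post: it assumes a Windows machine with where.exe on the path, uses a throwaway folder with made-up file names in place of c:\data, and uses LiteralPath instead of Path so that characters such as [ and ] in a file name are not treated as wildcards.

```powershell
# Throwaway folder so the sketch does not depend on c:\data.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("hsg-objects-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $root -Force | Out-Null
Set-Content -Path (Join-Path $root 'HSG-7-6-10-01.jpg') -Value 'first'
Set-Content -Path (Join-Path $root 'HSG-7-7-10-01.jpg') -Value 'second picture'

$files = @()
if (Get-Command -Name where.exe -CommandType Application -ErrorAction SilentlyContinue) {
    # where.exe emits one path string per line; Get-Item turns each string into a FileInfo object.
    $files = @(where.exe /R $root HSG-7*.jpg | ForEach-Object { Get-Item -LiteralPath $_ })
}

# With objects in hand, properties such as Length and LastWriteTime are available.
$files | Sort-Object -Property Length -Descending | Select-Object Name, Length, LastWriteTime

# Total size of the matching files, in bytes.
$totalBytes = ($files | Measure-Object -Property Length -Sum).Sum
"Total size: {0} bytes" -f $totalBytes

# Clean up the throwaway folder.
Remove-Item -Path $root -Recurse -Force
```

The extra Get-Item call per file costs time, so this form makes the most sense when only a handful of matches need to become objects.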

Well, the sun is coming up now, and I think I want to head out to my woodworking shop. Join us tomorrow for another edition of Weekend Scripter.

We invite you to follow us on Twitter or Facebook. If you have any questions, send email to us at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

 

Ed Wilson and Craig Liebendorfer, Scripting Guys

  • Wait! What's the performance of that where.exe/get-item combo? Don't leave me hanging!

  • You can improve the performance on Get-ChildItem significantly by using -Filter instead of -Include.

  • I think that comparison is not the best one. -Include allows me to search using wildcard character ranges ([a-z]*.doc), and where.exe does not. I think it would be 'fair' to compare where.exe with Get-ChildItem -Filter, which is quicker than -Include, is comparable in speed to where.exe (at least in my tests), and returns objects.

  • I performed a similar test on my laptop's public pictures directory. It has 99 subdirectories containing a total of 6,184 images. It is pretty obvious which option was the fastest.

    Measure-Command { Get-ChildItem -Path "C:\Users\Public\Pictures" -Filter "*.jpg" -Recurse }

    TotalSeconds      : 1.6251098

    Measure-Command { Where.exe /R C:\Users\Public\Pictures *.jpg }

    TotalSeconds      : 0.3622698

    Measure-Command { [System.IO.Directory]::GetFiles("C:\Users\Public\Pictures","*.jpg","AllDirectories") }

    TotalSeconds      : 0.1367007

  • This is even faster. In my tests, it's at least twice as fast as using Where.exe.

    [System.IO.Directory]::GetFiles("C:\Path\To\Pictures\Here","*.jpg","AllDirectories")

  • I performed this test to obtain the full name for all the images in my public user's pictures directory. It has 99 subdirectories containing a total of 6,184 photos.

    (Measure-Command { Where.exe /R "C:\Users\Public\Pictures" "*.jpg" }).TotalSeconds

    0.412819

    (Measure-Command { Get-ChildItem -Path "C:\Users\Public\Pictures" -Filter "*.jpg" -Recurse | % {$_.FullName} }).TotalSeconds

    2.6857958

    (Measure-Command { Get-ChildItem -Path "C:\Users\Public\Pictures" -Include "*.jpg" -Recurse | % {$_.FullName} }).TotalSeconds

    3.9768049

    (Measure-Command { [System.IO.Directory]::GetFiles("C:\Users\Public\Pictures","*.jpg","AllDirectories") }).TotalSeconds

    0.1368741

    Don't you suppose Get-ChildItem is slower because it's enumerating a lot more information, whereas the other two methods are strictly obtaining just the full name of the files?

  • Well, if you need strings only, and you cannot use live objects, then sure, anything that asks the file system for file names only is better than PowerShell... But what's the point? Get rid of ForEach-Object in your samples: it's not only time-consuming, it makes gci as limited as Ed's where.exe and your [System.IO.Directory]::GetFiles() ;) You may see it as an advantage, but it's a huge disadvantage of both approaches for me. Let's say I want the total size of those pictures...?

  • No arguments here. Personally I'd use Get-ChildItem unless there was a very compelling reason the script had to process things faster. In that case (which was the point of the blog post btw), since PowerShell is based on .NET, I'd search for a .NET solution instead of using something like Where.exe, especially when using .NET directly is the fastest option. (And if speed is important and you wanted file sizes, there are .NET methods for that too.)

    P.S. I piped the gci to obtain the fullnames to make it an orange to orange comparison. But I took it out and ran the same test:

    (Measure-Command { Get-ChildItem -Path "C:\Users\Public\Pictures" -Filter "*.jpg" -Recurse }).TotalSeconds

    1.6268072

    (Measure-Command { Get-ChildItem -Path "C:\Users\Public\Pictures" -Include "*.jpg" -Recurse }).TotalSeconds

    2.7626639
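
The total-size question raised in the thread can be answered while staying with .NET. The following sketch is one possible approach and does not come from any of the commenters: it uses System.IO.DirectoryInfo, whose GetFiles method returns FileInfo objects rather than strings. The throwaway folder and file names are made up so that the sketch runs without C:\Users\Public\Pictures.

```powershell
# Throwaway folder with files of known size: 100 + 50 bytes of .jpg, plus one file that should not match.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("hsg-dotnet-" + [guid]::NewGuid())
$sub  = Join-Path $root 'sub'
New-Item -ItemType Directory -Path $sub -Force | Out-Null
[System.IO.File]::WriteAllBytes((Join-Path $root 'one.jpg'), [byte[]](1..100))
[System.IO.File]::WriteAllBytes((Join-Path $sub 'two.jpg'), [byte[]](1..50))
[System.IO.File]::WriteAllBytes((Join-Path $sub 'notes.txt'), [byte[]](1..10))

# DirectoryInfo.GetFiles returns FileInfo objects, so Length is available without a second lookup.
$dir   = New-Object -TypeName System.IO.DirectoryInfo -ArgumentList $root
$files = $dir.GetFiles('*.jpg', [System.IO.SearchOption]::AllDirectories)

# Total size of the matching files, in bytes.
$totalBytes = ($files | Measure-Object -Property Length -Sum).Sum
"{0} files, {1} bytes" -f $files.Count, $totalBytes

# Clean up the throwaway folder.
Remove-Item -Path $root -Recurse -Force
```

Whether this is faster than Get-ChildItem -Filter on a given folder is something to measure rather than assume.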