Hey, Scripting Guy! Question

Hey, Scripting Guy! How can I determine the 20 largest files on a computer?

-- PW

Hey, Scripting Guy! Answer

Hey, PW. Having recently returned from a vacation in Europe, the Scripting Guy who writes this column now understands that there are different ways to do things. Some of these would be difficult to adjust to on a regular basis: for example, ordering a soft drink in a restaurant and getting a teeny-tiny little glass with no ice and without free refills still seems like a crime against humanity. Other things - like getting 6-8 weeks of paid vacation each year - would be a bit easier to accept. The point is not that any of these are good (crepes smothered in dark chocolate sauce) or bad (pastries filled with - and we are not making this up - snails). The point is that it’s OK to do and to try different things.

Which is just a way of rationalizing the fact that, unlike our usual approach, this time we aren’t going to provide you with a solution that’s built into the operating system. Determining the 20 largest files on a computer is theoretically possible using nothing more than WMI; however, the script would be a bit cumbersome to write, and might take hundreds of thousands of years to run. (Which, coincidentally enough, is the same amount of time that elapses between the moment you finish your meal and the moment your Parisian waiter brings you the check.) If WMI represented the only way to find the 20 largest files we’d bite the old bullet and do that. But there’s a much better and much easier way to do this: download, install, and use Log Parser 2.2.

If you aren’t familiar with Log Parser 2.2, you should take a look at this Tales from the Script column. Log Parser is a nifty little utility that, as the name implies, makes it quick and easy to parse plain-text log files. However, Log Parser also makes it quick and easy to parse event logs, the file system, the registry, even Active Directory. After you’ve installed Log Parser, retrieving (and sorting) the 20 largest files on a computer requires just a couple of minutes of processing time, and involves a script no more complicated than this:

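' Create the Log Parser query object
Set objLogParser = CreateObject("MSUtil.LogQuery")

' Input: the file system, searched recursively (-1 = no limit on folder depth)
Set objInputFormat = CreateObject("MSUtil.LogQuery.FileSystemInputFormat")
objInputFormat.Recurse = -1

' Output: the command window (-1 = never pause between screens of output)
Set objOutputFormat = CreateObject("MSUtil.LogQuery.NativeOutputFormat")
objOutputFormat.rtp = -1

' Ask for the 20 largest files on drives C and D, largest first
strQuery = "SELECT Top 20 Path, Size FROM 'C:\*.*, D:\*.*' ORDER BY Size DESC"
objLogParser.ExecuteBatch strQuery, objInputFormat, objOutputFormat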

We won’t explain this script in great detail today; for that level of information you should take a look at the Tales from the Script column. We will note, however, that the script begins by creating an instance of the MSUtil.LogQuery object, then uses this line of code to indicate that we want to work with items in the file system:

Set objInputFormat = CreateObject("MSUtil.LogQuery.FileSystemInputFormat")

We then set the value of the Recurse property to -1; that tells Log Parser that we want to recursively search through all the folders in the specified path, no matter how deeply nested they are. (A value of 0, by contrast, would turn recursion off.) And, no, we haven’t specified the path yet; we’ll do that in a minute.

Those initial lines of code set up the input parameters; we then use the next two lines of code to set up the output parameters:

Set objOutputFormat = CreateObject("MSUtil.LogQuery.NativeOutputFormat")
objOutputFormat.rtp = -1

Again, without going into any great detail, the first line tells Log Parser to output data to the command window; setting the rtp property to -1 tells Log Parser to write all the data at once, without pausing after each screen and waiting for the user to press a key to continue. With Log Parser you aren’t limited to outputting data to the command window, but - for now - this seemed like the easiest and most intuitive approach.
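
For example, here is a minimal sketch of what sending the same results to a comma-separated values file might look like. It assumes Log Parser's CSV output format (MSUtil.LogQuery.CSVOutputFormat) in place of the native one, and it uses an INTO clause to name the target file. The path C:\Scripts\LargestFiles.csv is purely hypothetical, so substitute a folder that exists on your computer:

```vbscript
' Sketch: write the 20 largest files to a CSV file instead of the command window.
' Assumption: C:\Scripts\LargestFiles.csv is a made-up path; change it to suit.
Set objLogParser = CreateObject("MSUtil.LogQuery")

' Same input setup as before: the file system, searched recursively
Set objInputFormat = CreateObject("MSUtil.LogQuery.FileSystemInputFormat")
objInputFormat.Recurse = -1

' Swap the native (command window) output format for the CSV output format
Set objOutputFormat = CreateObject("MSUtil.LogQuery.CSVOutputFormat")

' The INTO clause tells Log Parser where to write the results
strQuery = "SELECT Top 20 Path, Size INTO C:\Scripts\LargestFiles.csv " & _
    "FROM 'C:\*.*, D:\*.*' ORDER BY Size DESC"
objLogParser.ExecuteBatch strQuery, objInputFormat, objOutputFormat
```

Nothing should appear in the command window this time; open the CSV file in a spreadsheet or text editor to see the results.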

With the input and output parameters taken care of, we next create our Log Parser query:

strQuery = "SELECT Top 20 Path, Size FROM 'C:\*.*, D:\*.*' ORDER BY Size DESC"

If you have some experience writing SQL queries, this particular line of code should be pretty straightforward. All we’re doing here is asking for the 20 largest files; to do that we request the “Top 20” files, sorted by size in descending order. When we make that request Log Parser will grab all the files and sort them in descending order; because we requested only the first (top) 20, however, the only data that will be displayed onscreen will be the 20 largest files. What if we wanted the 50 largest files? Then we’d ask for the Top 50 files:

strQuery = "SELECT Top 50 Path, Size FROM 'C:\*.*, D:\*.*' ORDER BY Size DESC"

What if we wanted the 20 smallest files? In that case, we’d sort the files in ascending order, which would put the smallest files at the top of the list:

strQuery = "SELECT Top 20 Path, Size FROM 'C:\*.*, D:\*.*' ORDER BY Size ASC"

See? We told you it was pretty straightforward.
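
If you find yourself changing the number and the sort order a lot, one possible approach is to let a small function assemble the query text for you. The sketch below is just that, a sketch: BuildQuery is a hypothetical helper of our own, not something supplied by Log Parser, and it only builds the string without running anything:

```vbscript
' Sketch: build the query text from a file count and a sort direction.
' BuildQuery is a hypothetical helper; it is not part of Log Parser.
Function BuildQuery(intCount, blnLargestFirst)
    Dim strOrder

    ' DESC puts the largest files first; ASC puts the smallest files first
    If blnLargestFirst Then
        strOrder = "DESC"
    Else
        strOrder = "ASC"
    End If

    ' CLng raises an error if intCount is not numeric
    BuildQuery = "SELECT Top " & CLng(intCount) & " Path, Size " & _
        "FROM 'C:\*.*, D:\*.*' ORDER BY Size " & strOrder
End Function

WScript.Echo BuildQuery(50, True)    ' the 50 largest files
WScript.Echo BuildQuery(20, False)   ' the 20 smallest files
```

The two strings echoed here match the Top 50 and ascending-order queries shown above; pass either one to ExecuteBatch in place of strQuery.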

As you can see, we’re asking only for the file path and the file size; needless to say, we could get other information about the files as well. (See the Log Parser documentation for more information.) Also, note that we need to indicate each of the drives that we want to search, separating the drives by commas:

strQuery = "SELECT Top 20 Path, Size FROM 'C:\*.*, D:\*.*' ORDER BY Size DESC"
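
As for that other information, here is a short sketch that adds two more columns to the output. It assumes that the file system input format exposes fields named CreationTime and LastWriteTime alongside Path and Size; check the Log Parser documentation for the complete list of fields before relying on these names:

```vbscript
' Sketch: report creation and last-modified dates along with path and size.
' Assumption: CreationTime and LastWriteTime are available file system fields.
Set objLogParser = CreateObject("MSUtil.LogQuery")

Set objInputFormat = CreateObject("MSUtil.LogQuery.FileSystemInputFormat")
objInputFormat.Recurse = -1

Set objOutputFormat = CreateObject("MSUtil.LogQuery.NativeOutputFormat")
objOutputFormat.rtp = -1

' Extra fields simply go in the SELECT list, separated by commas
strQuery = "SELECT Top 20 Path, Size, CreationTime, LastWriteTime " & _
    "FROM 'C:\*.*, D:\*.*' ORDER BY Size DESC"
objLogParser.ExecuteBatch strQuery, objInputFormat, objOutputFormat
```

Everything else in the script stays the same; only the SELECT list changes.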

By the way, *.* is a standard file system wildcard meaning “all the files, regardless of file name or extension.” If we were interested in only the 20 largest Microsoft Word documents we would modify our query like so:

strQuery = "SELECT Top 20 Path, Size FROM 'C:\*.doc, D:\*.doc' ORDER BY Size DESC"
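
And if the list of drives or the wildcard tends to change, you could piece the FROM portion of the query together from an array rather than typing it by hand. Again, this is only a sketch: BuildFromClause is a hypothetical helper of our own, and the drive letters are simply the ones used in this column:

```vbscript
' Sketch: build the quoted FROM list from an array of drives and a wildcard.
' BuildFromClause is a hypothetical helper; it is not part of Log Parser.
Function BuildFromClause(arrDrives, strPattern)
    Dim i, strList
    strList = ""

    For i = 0 To UBound(arrDrives)
        ' Separate the drives with commas, as the query requires
        If i > 0 Then strList = strList & ", "
        strList = strList & arrDrives(i) & "\" & strPattern
    Next

    ' Wrap the whole list in single quotes, just like the hand-typed query
    BuildFromClause = "'" & strList & "'"
End Function

strQuery = "SELECT Top 20 Path, Size FROM " & _
    BuildFromClause(Array("C:", "D:"), "*.doc") & " ORDER BY Size DESC"
WScript.Echo strQuery
```

With the values shown, the echoed string is identical to the Word document query above.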

Save the script as a .vbs file and have at it; you’ll be surprised at how quickly you’ll get back information regarding the 20 largest files on the computer.

Although probably not as surprised as we were to discover that a traditional English breakfast includes both baked beans and a grilled tomato. But hey, food is food, right?

Unless, of course, there happen to be snails in it.