Use a PowerShell Cmdlet to Count Files, Words, and Lines

Summary: Learn how to use a powerful Windows PowerShell cmdlet to count words and lines in files, or to count files.

 

Microsoft Scripting Guy Ed Wilson here. The weekend is halfway over in Charlotte, North Carolina. For my friends in Australia, the weekend is already over, and they are on their way to work. Of course they get to start their weekend earlier than I do. The ideal thing to do is to be in Australia to start the weekend, and then pop back to Charlotte to conclude the weekend. Yes, I have strange thoughts on the weekend. For example, I was on the treadmill earlier, and I was thinking about my favorite Windows PowerShell cmdlet. Anyway, I called the Scripting Wife while I was cooling down. She was downstairs and it is easier to call her than to go down there. Cell phones make great intercoms.

“What are you doing?” she asked as she answered her Windows 7 phone.

“I just finished running on the treadmill, and I am now cooling down. I was wondering what your favorite Windows PowerShell cmdlet is.”

“You have got to be kidding. Why would I have a favorite Windows PowerShell cmdlet?” she asked.

“Well, I was thinking about my favorite Windows PowerShell cmdlet while I was running, and I realized I did not know what yours was,” I said.

“Get-Real,” she said as she hung up.

At times, I think that the Scripting Wife seems to believe I am a nerd. I am not positive of this and am somewhat afraid to ask, but she seems to give off the “nerd alert” vibe when I enter a room or when I call her on her cell phone from upstairs and ask her about her favorite Windows PowerShell cmdlet.

Anyway, I will share my favorite cmdlet—it is the Measure-Object cmdlet. If I did not have the Measure-Object cmdlet, I would need to count the files in a folder manually. This is shown here:

$i = 0
Get-ChildItem -Path c:\fso -Recurse -Force |
    ForEach-Object { $i++ }
$i

 

Using the Measure-Object cmdlet, it is easy to count the files. I merely need to use the following steps.

  1. Use the Get-ChildItem cmdlet to return a listing of FileInfo objects (along with DirectoryInfo objects for any subfolders, which are counted as well). Use the Recurse switch to cause the cmdlet to work through subfolders. The Force switch is used to return any hidden or system files. Pass the path to count to the Path parameter.
  2. Pipe the objects from step one to the Measure-Object cmdlet.

An example of using this command is shown here. Because no path is specified, it counts the items in the current folder, so I run it from the c:\fso folder:

Get-ChildItem -Recurse -Force | Measure-Object

The command and associated output are shown in the following figure. Note that I ran the command twice: the first time without the force switched parameter, and the second time using it.

Image of command and associated output
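As a rough sketch of the difference between counting every item and counting only files, the following builds a small throwaway folder tree in the temp directory and measures it both ways. The folder and file names are hypothetical stand-ins for c:\fso, and the Where-Object filter on PSIsContainer is one assumed way to exclude folders, not the only one.

```powershell
# Build a small throwaway folder tree so the sketch runs anywhere.
# The folder and file names are hypothetical stand-ins for C:\fso.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
$sub  = Join-Path $root 'sub'
New-Item -ItemType Directory -Path $sub -Force | Out-Null
Set-Content -Path (Join-Path $root 'a.txt') -Value 'one'
Set-Content -Path (Join-Path $sub 'b.txt') -Value 'two'

# Everything Get-ChildItem returns: two files plus the 'sub' folder.
$all = Get-ChildItem -Path $root -Recurse -Force | Measure-Object

# Files only: drop containers (folders) before counting.
$files = Get-ChildItem -Path $root -Recurse -Force |
    Where-Object { -not $_.PSIsContainer } |
    Measure-Object

'{0} items, {1} files' -f $all.Count, $files.Count

# Clean up the throwaway folder.
Remove-Item -Path $root -Recurse -Force
```

Storing the Measure-Object result in a variable makes the Count property available for later use in a script, rather than only displaying it.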

But the Measure-Object cmdlet does more than just count the number of files in a folder. It can also tell me information about a text file. A sample file is shown in the following figure.

Image of sample file

If I want to know how many lines are contained in the file, I use the Measure-Object cmdlet with the line switch. This command is shown here:

Get-Content C:\fso\a.txt | Measure-Object -Line

If I need to know the number of characters, I use the character switch:

Get-Content C:\fso\a.txt | Measure-Object -Character

There is also a Word switched parameter that returns the number of words in the text file. It is used in the same way as the Character and Line switched parameters. The command is shown here:

Get-Content C:\fso\a.txt | Measure-Object -Word

In the following figure, I use the Measure-Object cmdlet to count lines; then lines and characters; and finally lines, characters, and words. These commands illustrate combining the switches to return specific information.

Image of using Measure-Object to count
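The combined switches can also be sketched in script form. The example below writes a three-line sample file (hypothetical content, not the a.txt file from the figure) to the temp directory, measures it with all three switches at once, and reads the results back as properties.

```powershell
# Write a three-line sample file to a temporary location (hypothetical content).
$file = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString() + '.txt')
Set-Content -Path $file -Value 'The quick brown fox', 'jumps over', 'the lazy dog'

# Combine the switches and keep the result object instead of printing it.
$m = Get-Content -Path $file | Measure-Object -Line -Word -Character

# Each requested measurement is exposed as a property of the result.
'Lines: {0}  Words: {1}  Characters: {2}' -f $m.Lines, $m.Words, $m.Characters

# Clean up the sample file.
Remove-Item -Path $file
```

Because Get-Content emits each line as a separate string without its line terminator, the character count covers only the text on the lines, not the line breaks.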

One really cool thing I can do with the Measure-Object cmdlet is to measure specific properties of the piped objects. For example, I can use the Get-ChildItem cmdlet to return fileinfo objects for all the text files in the folder. I can examine the length property and find out the minimum length of the files in the folder, the maximum length, the average size, and the total length of all files in the folder. This command and associated output are shown here:

PS C:\fso> Get-ChildItem -Filter *.txt | Measure-Object -Property Length -Maximum -Minimum -Average -Sum

Count    : 66
Average  : 305903.833333333
Sum      : 20189653
Maximum  : 12534760
Minimum  : 0
Property : Length
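The same statistics can be captured in a variable and used in calculations. The sketch below is only an illustration: it creates three hypothetical text files of known size (0, 1024, and 2048 bytes) in a temp folder, so the numbers differ from the output above.

```powershell
# Create a throwaway folder with three files of known size (hypothetical names).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $root -Force | Out-Null
[System.IO.File]::WriteAllText((Join-Path $root 'empty.txt'), '')
[System.IO.File]::WriteAllText((Join-Path $root 'small.txt'), ('x' * 1024))
[System.IO.File]::WriteAllText((Join-Path $root 'large.txt'), ('x' * 2048))

# Measure the Length property of each file and keep the result object.
$stats = Get-ChildItem -Path $root -Filter *.txt |
    Measure-Object -Property Length -Minimum -Maximum -Average -Sum

# The statistics are ordinary numeric properties, so they can be converted or compared.
'Total size: {0} KB across {1} files' -f ($stats.Sum / 1KB), $stats.Count
'Largest: {0} bytes, smallest: {1} bytes' -f $stats.Maximum, $stats.Minimum

# Clean up the throwaway folder.
Remove-Item -Path $root -Recurse -Force
```

Only the statistics requested through switches are populated; Count is always present.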

 

If I want to, I can pipe the output to a table and create my own custom headings and output. In the following example, I display the average size of the files in kilobytes. I also define the format to omit decimal places:

PS C:\fso> Get-ChildItem -Filter *.txt | Measure-Object -Property Length -Maximum -Minimum -Average -Sum | Format-Table Count, @{Label="Average size(KB)"; Expression={($_.Average/1KB).ToString("0")}}

Count    Average size(KB)
-----    ----------------
   66    299
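Format-Table produces output meant for display. When the converted value is needed later in a script, one alternative (an assumption on my part, not something the command above does) is to use a calculated property with Select-Object, which emits a regular object. The sketch uses a stand-in object built from the Count and Average values shown earlier; the AverageKB property name is hypothetical.

```powershell
# Stand-in for the Measure-Object result, using the Count and Average shown earlier.
$stats = New-Object psobject -Property @{ Count = 66; Average = 305903.833333333 }

# Select-Object emits a real object, so AverageKB stays a number that can be reused.
$row = $stats | Select-Object Count, @{ Name = 'AverageKB'; Expression = { [math]::Round($_.Average / 1KB) } }
$row

# The -f operator is another way to drop the decimals when only text is needed.
'{0:N0} KB' -f ($stats.Average / 1KB)
```

Rounding with [math]::Round keeps the value numeric, whereas ToString and the -f operator return strings.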

 

Well, that is about all there is to say for now. The Measure-Object cmdlet is one of my favorite cmdlets because it is easy to use and extremely flexible—an unbeatable combination in my book. What is your favorite cmdlet? Add a comment below and let me know. Until tomorrow, see ya.

I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy

 

 

Comments
  • In addition to text files, you can use the Measure-Object cmdlet against Microsoft Word documents as well... For example: Get-Content "c:\myWordDocument.docx" | measure-object -line -word -character

  • Hi Ed,

    this is cool!

    Measure-Object takes away some tedious counting tasks from us. I should remember to use it more often.

    Sometimes I'm still doing these things manually and this is really just an old habit that I should overcome now :-)

    Something completely different:

    Why does the new layout of this blog require scrolling down so far in the browser?

    I really thought that the blog was empty last Friday and didn't notice the enormous length of the scroll bar :-( Why is there so much empty, unused space on this page?

    Klaus

  • @Klaus Measure-Object is cool. I am not sure what has changed about the blog layout. Let me check on this.

  • @Craig Lussier I had not tried to use the Measure-Object cmdlet with Microsoft Word documents because when I used Get-Content to attempt to read a Microsoft Word .DOC file, it made lots of beeping noises, displayed gibberish on the screen, and literally locked up Windows PowerShell. Based upon that, I had not attempted it again. However, I just tried Get-Content on a file and piped it to Measure-Object, and it appeared to work. But when I looked at the numbers, there was a discrepancy. Measure-Object reported 1002 words for my .DOC file, and when I opened the document in Microsoft Word, it said there were 1400 words ... so it was a huge difference. I wrote a script that uses the COM automation model to actually open each Word document and get the statistics ... and that is what I will stick with for now. Thank you for your comment because it caused me to reassess my assumptions.

  • @Ed - I should have taken a look at the Word Count in Word before posting - the numbers are off indeed. I made assumptions myself with the docx file and I wonder if it is taking into account any additional markup in the background of the document in addition to the text you see when creating/editing a document in Word. I will second your comment with reference to using COM to get the stats of a Word document - for anyone reading this the methods in this post are absolutely accurate for .txt files, however with Office files, please use Office Automation (COM) to ensure you get the correct numbers. Good conversation. Ed if you do find anything further on this please write a blog post - it will be interesting. Thanks for your comment Ed. Cheers. C.

  • Why not port, Perl, Sed, Awk and bash into the powershell, seems to me that as a Unix guy you guys have attempted to reinvent the wheel, make it too complex.. I have hundreds of shell scripts, and am being forced to migrate to a Microsoft platform on a key piece of equipment. I cringe when I see how terribly complex your scripting is. Simple is better.

    In unix there is a simple command # wc  will count the words in a file.

    Seriously..

  • @JR you do not need to port the Unix type of utilities to Powershell. They are already available, all you need to do is to enable services for unix on your operating system (On Windows 7 it is called Subsystem for Unix based applications). One thing to keep in mind, all these Unix type of utilities are text based, they do not return objects like the PowerShell commands do. Therefore you will have to use regular expressions to parse the output from the commands.

  • count files in a dir:

    (dir).count

    count lines in a file:

    (type .\file_to_count_lines).count

  • How can I use some of the above and get the line counts for each file within a directory. Using -Line with the *.txt filter just gives me the grand total. If I have 30 files, I need 30 line count values, say.

  • Get-ChildItem -Filter *.txt | ForEach-Object { '{0,-20}{1,10}' -f $_.Name, @(Get-Content $_.FullName).Count }

  • This is exactly what i was looking for. thanks for sharing!

  • To count all lines (of code) of files in a directory a double pipe may be used:

    Get-ChildItem -Filter *.bas | Get-Content | Measure-Object -Line

  • These are all well and good... but I need to count the lines in files that are MILLIONS of lines long.

    These methods take forever!

  • Thank you, Ed the Microsoft Scripting Guy, and thank you Markus, as well. I was able to calculate the sum of the number of words in the text files in a given folder using a modified version of Markus's script:

    Get-ChildItem -Filter *.txt | Get-Content | Measure-Object -Word

    And if we want just the number on its own, instead of having it in an ASCII art table, then we can pipe it again and extract just the number using Select-Object:

    Get-ChildItem -Filter *.txt | Get-Content | Measure-Object -Word | Select-Object -expand Words

  • Can I use this for multiple computers? For example, I want to count the folders under sysvol\domain\policy on all of my DC servers.