Use PowerShell to Parse an XML File and Sort the Data

Use PowerShell to Parse an XML File and Sort the Data

  • Comments 6
  • Likes

Summary: In this blog, the Scripting Wife learns how to use Windows PowerShell to parse her books XML files and to find authors and sort titles.

Microsoft Scripting Guy, Ed Wilson, is here. Today the registration opens for the 2012 Scripting Games. You can go to the 2012 Scripting Games site on PoshCode, click Log On, choose your authentication mechanism, and fill out your user name and email address. Make sure your user name is the name you want to appear on the leaderboard, and on your 2012 Scripting Games certificate. Also, make sure your email address is correct because it is used to notify you about prizes.

Yesterday, the Scripting Wife asked me to help her with exploring an XML file. Last night, she told me that she wants me to help her look at the XML file she got by exporting her book database so that she can find titles of books and things like that. I think I will call her, and see if she is ready…

“Scripting Wife?” I call in my nicest voice.

All of a sudden I look up, and she is here.

“You don’t have to yell. You could simply have called me on my Windows 7 phone,” she replied.

“I didn’t yell—and besides, I was not sure you had your Windows 7 phone with you,” I apologized.

“I always have it with me,” she said rather curtly. “So are you going to show me how to find titles in my XML file?”

“Yep. Why don’t you sit over here next to the computer, and first read your XML file into a variable. Remember to use the [XML] symbol to make sure that you get an XML document,” I said.

The Scripting Wife thought for a minute, and then typed the following command.

[xml]$books = Get-Content C:\fso\books.xml

“Now, remember that all your book data was in the book node. So why don’t you retrieve that, pipe it to the Foreach-Object cmdlet, and then retrieve the Title property,” I said.

The Scripting Wife thought for a few seconds and began to type. Within a minute or so, she had composed the command that is shown here. (The percentage sign (%) is an alias for the Foreach-Object cmdlet.)

$books.bookinfo.booklist.book | % {$_.mainsection.title}

The two commands and the output associated with the commands are shown in the image that follows.

Image of command output

“That is pretty cool, but I want to be able to find out how many books I have based on subject,” she asked.

“Well, that is pretty easy. Take your book element, pipe it to the Foreach-Object cmdlet, then pipe that to the Sort-Object cmdlet, and choose the DisplayName property. Then group it by DisplayName. Use the NoElement switched parameter to keep from displaying a bunch of extra information,” I said.

She thought for a second, and then used the Up arrow to retrieve her previous command. She then modified it until the following command appeared.

$books.bookinfo.booklist.book | % {$_.subjects.subject} | sort displayname | group displayname -NoElement

She turned the monitor to me, and the output that is shown in the image that follows was the result.

Image of command output

“That was pretty complicated. Can’t we do something easier? For instance, I want to find all the Perry Mason novels,” she requested.

“That one really is easy. Retrieve your command that lists all the titles. Because the titles are all strings, pipe that to the Where-Object and use the Match operator to look for Mason,” I said.

This time the Scripting Wife did not hesitate. She retrieved her next to last command by pressing the Up arrow twice. Then she added a pipe character, used the question mark (?) as an alias for the Where-Object. She added the Match operator and looked for the word Mason. Her completed command is shown here.

$books.bookinfo.booklist.book | % {$_.mainsection.title} | ? { $_ -match 'mason'}

“This is more like it. Here take a look,” she said as she turned the monitor towards me.

Image of command output

“That is all I really wanted to see right now. I think I am going to head to the mall with a couple of my friends. You have been keeping me too busy around here, and I need a break,” she said as she jumped up and headed towards the door.

“Don’t forget…registration for the 2012 Scripting Games is open,” I said.

“I will register tomorrow,” she said.

And she was gone.

Join us tomorrow when I imagine the Scripting Wife will register for the 2012 Scripting Games.

I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy 

Your comment has been posted.   Close
Thank you, your comment requires moderation so it may take a while to appear.   Close
Leave a Comment
  • Hi Teresa, Hi Ed,

    I just visited the PoshCode site and successfully logged into my old account.

    In fact I looked a bit over my old scripts and some other scripts just to bring back old memories :-) Amazing that this has been only 11 months ago ...

    Back to this blog entry: filtering, sorting, grouping and searching XML content is something that we can do with Powershell-on-board-tools and that's really a help if we can't use other specialized tools. I don't know much about the upcoming PS V3 CTP2, so I only can hope, that the XML support has been one of the subject that might have been extended or improved in the new version. Up to now only Select-Xml and the specialized Get-WinEvent cmdlet support XPATH parameters which should be the preferred way of selecting XML nodes. Other XML topics like XSLT stylesheet transformations are missing competely in Powershell v1 and v2 but as usual could be accomplished by means of the underlying .Net framework v2.0 ( or 4.0 later on )

    Again: It's good to have a basic XML toolkit around in Powershell ...

    Klaus

  • One thing, I forgot to ask yesterday:

    "Could you provide a link to the books.xml file or at least publish a listing of the content, please!?"

    Klaus

  • The following is a great place for XPath novices to start learning.  The turorial has online test faculity so you can try as you go.  And oh! It's Free!

    www.w3schools.com/xpath

    YouTube is now loaded with many XPath and XSLT videos.

  • Hi Scripting Guy

    You saved my bacon today!! Thanks for the tutorial.

  • This is some good stuff! :) Thank you!

  • I've searched for hours .... is there nowhere on the planet that the source books.xml file can be found? There IS one on the MSDN site, but it is not the once used here.