Hey, Scripting Guy! Can I Sort a Tab-Separated Value File with Windows PowerShell 2.0?

Hey, Scripting Guy! Can I Sort a Tab-Separated Value File with Windows PowerShell 2.0?

  • Comments 1
  • Likes

 

Hey, Scripting Guy! Question

Hey Scripting Guy! I have a bunch of data in a tab-separated value (TSV) file that I need to sort based upon one of the columns. If you could help me it would be great. We are using Windows PowerShell 2.0 if that is of any importance to your solution.

-- ML

Hey, Scripting Guy! Answer

Hello ML,

Microsoft Scripting Guy Ed Wilson here. When your friends and family find out you drink tea and have basically quit drinking coffee, tea bags seem to follow you around. This afternoon I am drinking a cup of cinnamon apple tea. It tastes just right, especially when accompanied by a fresh slice of Fuji apple, and some aged Irish cheddar cheese. I am listening to the James Gang on my Zune, and feeling kind of mellow.

Depending on what you need to accomplish, you can use the Import-Csv cmdlet to read your TSV file and pipe the results to the Sort-Object cmdlet to solve your problem. Windows PowerShell 2.0 added the delimiter parameter to the Import-Csv cmdlet, and it can therefore be used to solve all kinds of interesting dilemmas. To use the Import-Csv cmdlet to read a TSV file, use the “`t” symbol to represent a tab character. Because the Import-Csv cmdlet creates a custom object, you can pipe the results of the operation to other cmdlets. This is seen here:

Import-Csv -Path "C:\ScriptingGuys\Javelin Throw Data.txt" -Delimiter "`t" |
Sort-Object -Property "throw 1" -Descending | Select-Object name, "throw 1"

This week we will be reviewing some of the scripts that were submitted during the recently held 2009 Summer Scripting Games. The description of the 2009 Summer Scripting Games details all of the events. Each of the events was answered by a globally recognized expert in the field. There were some cool prizes and winners were recognized from around the world. Additionally, just like at the "real Olympics" because there was a lot going on, an "if you get lost page" was created. Communication with participants was maintained via Twitter, Facebook, and a special forum. (The special forum has been taken down, but Twitter and Facebook are still used to communicate with Hey, Scripting Guy! fans). We will be focusing on solutions that used Windows PowerShell. We have several good introduction to Windows PowerShell Hey, Scripting Guy! articles that you will find helpful.

The Advanced Event 9 from the 2009 Summer Scripting Games also had you parsing a tab-separated text file. The contribution from Ayowu used Microsoft Office Excel to sort the data. The complete ScriptingGamesAdvancedEvent9.ps1 script is seen here.

ScriptingGamesAdvancedEvent9.ps1

$txtpath = "C:\ScriptingGuys\Javelin Throw Data.txt";
$excelpath = "C:\ScriptingGuys\Adv9_"+([datetime]::Now).tostring("yyyyMMddhhmmss")+".xls";
$xlCellTypeLastCell=11;

$excel = New-Object -comobject Excel.Application;
$excel.Visible = $true;
$excel.DisplayAlerts =$false;
$functions = $excel.WorkSheetfunction;
$book =$excel.Workbooks.Open($txtpath);
$book.SaveAs("$excelpath");
$book.Close();

$book = $excel.Workbooks.Open($excelpath);
$sheet = $book.Worksheets.Item(1);
$used=$sheet.usedRange;
$lastCell=$used.SpecialCells($xlCellTypeLastCell);

2..$lastCell.row |
foreach{$range=$sheet.range("E$_"+":"+"G$_");$sheet.cells.item($_,8) = $functions.max($range)};
2..$lastCell.row |
foreach{$formula = "=RANK(H$_,`$H`$2"+":`$H`$"+[string]$lastCell.row+")" ;  $sheet.range("I$_:I$_").cells.formula = $formula;  }

2..$lastCell.row |
foreach{ $sheet.cells.item($_,9).value2 = $lastCell.row - $sheet.cells.item($_,9).value2  }

$range = $sheet.usedRange
$range2=$sheet.range("I2");
$range.sort($range2, 2)

2..$lastCell.row |
foreach{ $sheet.cells.item($_,9).value2 = $lastCell.row - $sheet.cells.item($_,9).value2  }

$book.Save();
$book.Close();
$excel.Quit();

The first thing that Ayowu does is create three variables. The first one holds the path to the text file, the second one the path and filename for the Microsoft Excel spreadsheet, and the last one is the last cell in the range. The value 11 is the xlCellTypeLastCell enumeration value that is used with the SpecialCells method. It is used to reference the last cell in the range. One thing that was interesting was the way that Ayowu used the system.datetime .NET Framework class to create a date/time stamp that was used in creating the file name. The Now property is a static property which means it is always available from the datetime object and returns a datetime object. When the tostring() method is used, it turns the datetime object into a string representation of the current date and time. When you add the letters yyyy, you get a four-digit representation of the year. The letter m is used for minutes, and a capital M is used for months. Some examples of using this object are seen here:

PS C:\> [datetime]::now

Wednesday, August 12, 2009 5:43:00 PM

PS C:\> [datetime]::now.tostring()
8/12/2009 5:43:02 PM
PS C:\> [datetime]::now.tostring("yymmdd")
094312
PS C:\> [datetime]::now.tostring("yy/mm/dd")
09/43/12
PS C:\> [datetime]::now.tostring("yy/MM/dd")
09/08/12
PS C:\>

The three variables are seen here:

$txtpath = "C:\ScriptingGuys\Javelin Throw Data.txt";
$excelpath = "C:\ScriptingGuys\Adv9_"+([datetime]::Now).tostring("yyyyMMddhhmmss")+".xls";
$xlCellTypeLastCell=11;

The next thing that Ayowu does is create an instance of the Excel Application object, set a couple of properties, and create a WorksheetFunction object by querying the WorksheetFunction property. The workbook is then saved and closed. This is seen here:

$excel = New-Object -comobject Excel.Application;
$excel.Visible = $true;
$excel.DisplayAlerts =$false;
$functions = $excel.WorkSheetfunction;
$book =$excel.Workbooks.Open($txtpath);
$book.SaveAs("$excelpath");
$book.Close();

The workbook is opened, and a Range object is created by querying the UsedRange property from the Worksheet object. An additional range object is obtained by using the SpecialCells method from the range object stored in the $used variable:

$book = $excel.Workbooks.Open($excelpath);
$sheet = $book.Worksheets.Item(1);
$used=$sheet.usedRange;
$lastCell=$used.SpecialCells($xlCellTypeLastCell);

One of the cool things about the script is the way Ayowu calls the max function to get the maximum value of the three throws. Next, the formula object is used to calculate the rankings. This is seen here:

2..$lastCell.row | foreach{$range=$sheet.range("E$_"+":"+"G$_");$sheet.cells.item($_,8) = $functions.max($range)};
2..$lastCell.row | foreach{$formula =  "=RANK(H$_,`$H`$2"+":`$H`$"+[string]$lastCell.row+")" ;  $sheet.range("I$_:I$_").cells.formula = $formula;  }
2..$lastCell.row | foreach{ $sheet.cells.item($_,9).value2 = $lastCell.row - $sheet.cells.item($_,9).value2  }

The values are written to the spreadsheet, and everything is closed. This is seen here:

$range = $sheet.usedRange
$range2=$sheet.range("I2");
$range.sort($range2, 2)

2..$lastCell.row |
foreach{ $sheet.cells.item($_,9).value2 = $lastCell.row - $sheet.cells.item($_,9).value2  }

$book.Save();
$book.Close();
$excel.Quit();

The completed spreadsheet is seen here:

Image of the completed Excel spreadsheet


To make the script a bit easier to read, I took the liberty of cleaning up some of the formatting. I also removed the unnecessary opening and closing of the Microsoft Excel spreadsheet. One of the things I removed was the semicolon at the end of each line. This is not needed because the carriage return is the end of the line. At first I sort of liked it, until I started reformatting the script and noticed that inside the script blocks some commands were separated by semicolons because they were needed there. Then the confusion set in and I removed them. The revised script is seen here. I did not do any cleanup with the function or the sorting because that is the cool thing about the script in the first place:

$txtpath = "C:\ ScriptingGuys\Javelin Throw Data.txt"
$excelpath = "C:\ScriptingGuys\ \Adv9_"+([datetime]::Now).tostring("yyyyMMddhhmmss")+".xls"
$xlCellTypeLastCell=11

$excel = New-Object -comobject Excel.Application
$excel.Visible = $true
$excel.DisplayAlerts =$false
$functions = $excel.WorkSheetfunction
$book =$excel.Workbooks.Open($txtpath)
$sheet = $book.Worksheets.Item(1)
$used=$sheet.usedRange
$lastCell=$used.SpecialCells($xlCellTypeLastCell)

2..$lastCell.row |
 ForEach-Object {
   $range=$sheet.range("E$_"+":"+"G$_")
   $sheet.cells.item($_,8) = $functions.max($range)
   }
2..$lastCell.row |
 ForEach-Object {
   $formula = "=RANK(H$_,`$H`$2"+":`$H`$"+[string]$lastCell.row+")"
   $sheet.range("I$_:I$_").cells.formula = $formula
   }
2..$lastCell.row |
 ForEach-Object {
   $sheet.cells.item($_,9).value2 = $lastCell.row - $sheet.cells.item($_,9).value2 
   }
$range = $sheet.usedRange
$range2=$sheet.range("I2")
$range.sort($range2, 2)
2..$lastCell.row |
 ForEach-Object {
   $sheet.cells.item($_,9).value2 = $lastCell.row - $sheet.cells.item($_,9).value2 
   }
  
$book.SaveAs("$excelpath")
$book.Close()
$excel.Quit()

ML, you now have two different ways to work with sorting tab-separated value files. Ayowu, thank you for submitting a cool script that uses Microsoft Excel to work with tab values.

If you want to be the first to know what is happening on the Script Center, follow us on Twitter or on Facebook. If you need assistance with a script, you can post questions on the Official Scripting Guys Forum, or send e-mail to scripter@microsoft.com. The 2009 Summer Scripting Games wrap-up will continue tomorrow. Until then, peace.

Ed Wilson and Craig Liebendorfer, Scripting Guys 

 

Your comment has been posted.   Close
Thank you, your comment requires moderation so it may take a while to appear.   Close
Leave a Comment