(Note: These solutions were written for Event 1.)
In the 100-meter event, you will be given the finish times of our runners. You will be asked to sort them and rank the gold, silver, and bronze winners.
Steven Murawski is the Director of PowerShellCommunity.Org. His podcast can be heard on mindofroot.com. He maintains a blog at usepowershell.com. Steve agreed to provide a VBScript solution and a Windows PowerShell solution for the first event.
My first thought after reviewing this problem was that this should not be too difficult. Reading files and looking for patterns in data are common tasks for IT pros of all persuasions. Here is the solution written in VBScript.
BeginnerEvent1Solution.vbs
Const ForReading = 1
Set objFS = CreateObject("Scripting.FileSystemObject")
Set objFile = objFS.OpenTextFile("C:\Scripts\100 Meter Event.txt", ForReading)

Set myRegExp = New RegExp
myRegExp.IgnoreCase = True
myRegExp.Pattern = "^(\w+?,\ .+?)\t(.+?)\s+(\d+\.\d+)$"

Const adVarChar = 200
Const MaxCharacters = 255
Const adFldIsNullable = 32
Const adDouble = 5
Set DataList = CreateObject("ADOR.Recordset")
DataList.Fields.Append "Name", adVarChar, MaxCharacters, adFldIsNullable
DataList.Fields.Append "Country", adVarChar, MaxCharacters, adFldIsNullable
DataList.Fields.Append "Time", adDouble, , adFldIsNullable
DataList.Open

Do Until objFile.AtEndOfStream
    strLine = objFile.ReadLine
    Set colMatches = myRegExp.Execute(strLine)
    For Each Match In colMatches
        DataList.AddNew
        DataList("Name") = Match.SubMatches(0)
        DataList("Country") = Match.SubMatches(1)
        DataList("Time") = Match.SubMatches(2)
        DataList.Update
    Next
Loop

'close file
objFile.Close

DataList.Sort = "Time"
Wscript.Echo "Gold Medal: " & DataList.Fields.Item("Name") & " " & DataList.Fields.Item("Country") & " " & DataList.Fields.Item("Time")
DataList.MoveNext
Wscript.Echo "Silver Medal: " & DataList.Fields.Item("Name") & " " & DataList.Fields.Item("Country") & " " & DataList.Fields.Item("Time")
DataList.MoveNext
Wscript.Echo "Bronze Medal: " & DataList.Fields.Item("Name") & " " & DataList.Fields.Item("Country") & " " & DataList.Fields.Item("Time")
Here is the solution written in Windows PowerShell.
BeginnerEvent1Solution.ps1
[regex]$regex = "^(?<Name>\w+?,\ .+?)\t(?<Country>.+?)\s+(?<Time>\d+\.\d+)$"
$Name = @{Name='Name';Expression={$_.groups["Name"].Value}}
$Country = @{Name='Country';Expression={$_.groups["Country"].Value}}
$Time = @{Name='Time';Expression={$_.groups["Time"].Value}}
$medals = 'Gold', 'Silver', 'Bronze'
$File = 'c:\scripts\100 Meter Event.txt'

# Time is captured as text, so it is converted to [double] for sorting;
# a plain string sort would place "10.02" ahead of "9.85".
$finalist = get-content $file |
    ForEach-Object { $regex.Match($_) } |
    Where-Object {$_.Success} |
    Select-Object -Property $Name, $Country, $Time |
    Sort-Object -Property { [double]$_.Time } |
    Select-Object -First $Medals.count

for ($i = 0; $i -lt $Medals.count; $i++)
{
    Write-Host "$($Medals[$i]) Medal: $($Finalist[$i].Name) $($Finalist[$i].Country) $($Finalist[$i].Time)"
}
Here is Steven Murawski’s description of the approach he took to unraveling the mysteries of the 100-meter event.
The first thing I did was open the data file, 100 Meter Event.txt, and look for patterns in how the data was stored. The text file contained three categories of information: Name, Country, and Time. Names were stored as “last name, first name”. After the name, there was at least one white-space character and then the country. Country names could be a single word or several words. After the country, at least one white-space character was followed by the runner’s time.
Because there was a pattern to the data, but not a unique breaking character (spaces were in the middle of the country names, there was a comma in the name but nowhere else, and some of the spacing characters were tabs and others were single spaces), I decided to use a regular expression to parse the text.
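To read in the file, I created an instance of the FileSystemObject object and used that to open the text file. This is seen here:
Const ForReading = 1
Set objFS = CreateObject("Scripting.FileSystemObject")
Set objFile = objFS.OpenTextFile("C:\Scripts\100 Meter Event.txt", ForReading)
After opening the file, I created my regular expression object:
Set myRegExp = New RegExp
myRegExp.IgnoreCase = True
myRegExp.Pattern = "^(\w+?,\ .+?)\t(.+?)\s+(\d+\.\d+)$"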
(For more information on regular expressions, check out MSDN.)
My regular expression takes the first set of word characters (letters, digits, or underscores) up to a comma, then a space and everything up to the first tab, and creates the first group (the name) from that. The second group (the country) is everything after that tab up to the white space preceding the time. Finally, the remaining digits and period character are saved as the third group.
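To see what the three groups pick up, the PowerShell form of the pattern can be tried against a single line. This is only a sketch: the sample line is an assumption (a name, a tab, a two-word country, a tab, and a time), not a line quoted from the data file.

```powershell
# Pattern copied from the PowerShell solution; the sample line is made up for illustration.
[regex]$regex = '^(?<Name>\w+?,\ .+?)\t(?<Country>.+?)\s+(?<Time>\d+\.\d+)$'
$sample = "Pfeiffer, Michael`tUnited States`t8.85"   # `t is a tab character

$match = $regex.Match($sample)
if ($match.Success) {
    $name    = $match.Groups['Name'].Value
    $country = $match.Groups['Country'].Value
    $time    = $match.Groups['Time'].Value
    "Name=[$name] Country=[$country] Time=[$time]"
}
```

Because the name group stops at the first tab, the space inside the country name does not split it into two pieces.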
To store these values, I created an in-memory dataset and opened it up:
Set DataList = CreateObject("ADOR.Recordset")
DataList.Fields.Append "Name", adVarChar, MaxCharacters, adFldIsNullable
DataList.Fields.Append "Country", adVarChar, MaxCharacters, adFldIsNullable
DataList.Fields.Append "Time", adDouble, , adFldIsNullable
DataList.Open
Next, I looped through the text file, reading each line and matching it against my regular expression. If there was a match, I added the result to the dataset:
Do Until objFile.AtEndOfStream
    strLine = objFile.ReadLine
    Set colMatches = myRegExp.Execute(strLine)
    For Each Match In colMatches
        DataList.AddNew
        DataList("Name") = Match.SubMatches(0)
        DataList("Country") = Match.SubMatches(1)
        DataList("Time") = Match.SubMatches(2)
        DataList.Update
    Next
Loop
After creating the dataset, I sorted it based on time:
DataList.Sort = "Time"
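The recordset sorts numerically because the Time field was declared as adDouble. In PowerShell, values captured by a regular expression are strings, so the same care is needed there. Here is a minimal sketch of the difference, using made-up times rather than values from the data file:

```powershell
# Made-up times; regex captures arrive as strings like these.
$times = '10.02', '9.85', '9.90'

# String comparison: '1' sorts before '9', so 10.02 wrongly comes first.
$asText = $times | Sort-Object

# Converting to [double] inside the sort key gives true numeric order
# while leaving the original strings untouched for display.
$asNumber = $times | Sort-Object { [double]$_ }

"Text order:    $($asText -join ', ')"
"Numeric order: $($asNumber -join ', ')"
```

If every time in the file has the same number of digits, both orders agree, which is why this kind of problem is easy to miss.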
Then it was just a matter of returning the top three results and displaying them on the screen:
Wscript.Echo "Gold Medal: " & DataList.Fields.Item("Name") & " " & DataList.Fields.Item("Country") & " " & DataList.Fields.Item("Time")
DataList.MoveNext
Wscript.Echo "Silver Medal: " & DataList.Fields.Item("Name") & " " & DataList.Fields.Item("Country") & " " & DataList.Fields.Item("Time")
DataList.MoveNext
Wscript.Echo "Bronze Medal: " & DataList.Fields.Item("Name") & " " & DataList.Fields.Item("Country") & " " & DataList.Fields.Item("Time")
Running the VBScript script should give you the following:
Here is the output I obtained when I ran the Windows PowerShell script:
The 100-meter event is the shortest outdoor distance. In this event, you will be required to read a text file and determine the shortest lines of text that it contains.
Kirk Munro is a PowerShell Solutions Architect and Windows PowerShell MVP. He maintains the Poshoholic.com Web site and tweets on Twitter.com/poshoholic.
Sprint. That is the name of the game for me these days. Lots to do, little time to do it. When the Scripting Guys asked me to provide a solution for the Advanced division of Event 1, I thought it was very appropriate to my work because it’s all about finding the shortest paths to get the job done. In addition, I thought it fitting to provide a nice, short solution to do it too, so let’s get started. First of all, here is the solution.
AdvancedEvent1Solution.ps1
param(
    $LiteralPath = '.\Personal Information Cards_ADV1.txt',
    $Count = 1
)
Get-Content -LiteralPath $LiteralPath `
    | Where-Object {$_.Trim()} `
    | Sort-Object {$_.Trim().Length} `
    | Select-Object -First $Count
At first glance, the problem is straightforward: Read in the contents of a file and determine what the three shortest lines of text contain. When you look a little more closely at the file, though, you’ll quickly realize that you can’t always count on things working as you expect. Some of the lines start with white space, and other lines contain nothing but white space. I do not consider white space to be text, so white space should be ignored. We will have to remember that when writing the solution.
Now that we’ve looked at the file we’re dealing with, the next step is to break down the problem into a set of manageable tasks. For this event, I came up with the following tasks:
· Read the contents of the text file.
· Filter out any lines that contain nothing but whitespace.
· Sort the remaining lines by the length of the text (remembering to strip the white space when determining the length).
· Show the first three lines (these will be the shortest lines after they are sorted).
This is a manageable list of tasks, so now we need to figure out what Windows PowerShell offers to solve each task. Fortunately, Windows PowerShell comes with an appropriate cmdlet for each of these tasks, making things pretty easy for us. Here are the cmdlets we will be able to use:
· Get-Content allows you to read the contents of files.
· Where-Object allows you to filter objects.
· Sort-Object allows you to sort objects.
· Select-Object allows you to pick which objects you want.
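Before pointing these cmdlets at the file, it can help to try the last three on a small in-memory array. The sketch below uses made-up lines in place of the output of Get-Content, and a simple `-match '\S'` test as one possible way to drop lines that contain no text.

```powershell
# Made-up lines standing in for the file contents that Get-Content would return.
$lines = 'Claims', '', 'Personal Information Cards', '   ', 'PPID'

# Filter out lines with no non-white-space character, sort the rest by
# their Length property (shortest first), and keep the first three.
$shortest = $lines |
    Where-Object { $_ -match '\S' } |
    Sort-Object -Property Length |
    Select-Object -First 3

$shortest
```

None of these sample lines has leading white space, so sorting on the raw Length is enough here; lines that do need extra handling.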
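The only other detail we need to know is that you can use the Trim method on strings to eliminate white space and the Length property on strings to determine the length. At this point, we have enough details to create our script in our favorite script editor (mine is PowerGUI). Here is what my Get-ShortestLine.ps1 script looks like:
param(
    $LiteralPath = '.\Personal Information Cards_ADV1.txt',
    $Count = 1
)
Get-Content -LiteralPath $LiteralPath `
    | Where-Object {$_.Trim()} `
    | Sort-Object {$_.Trim().Length} `
    | Select-Object -First $Count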
I threw a few parameters into the mix to give the script some flexibility, but basically it’s as simple as piping together the cmdlets chosen for each task and using trim where appropriate to prevent white space from skewing our results. Below you can see the results when I ran my script:
I hope that through this solution I have given you a few Windows PowerShell tips and tricks that you can use in your own scripts.
Stephanie Peters has worked for Microsoft for more than 10 years. She is a senior premier field engineer and a veteran scripting trainer. She writes a very popular blog on TechNet, Something About Scripting.
I was never much of a sprinter, actually. I am more of a survey the track, determine the strategy, and work your paces kind of girl. In this event, if you were to try to sprint off writing code at the starting gun, you might find yourself in a bit of trouble. There are a few things that need to be worked out before getting out of the blocks. OK, that’s enough with the play-on-the-track theme.
Seriously, though, there were a number of steps I had to work out before I hit my stride on this one. (Oops, sorry—will not do it again.)
The scenario sounds simple enough:
“In this event, you will be required to read the Personal Information Cards_ADV1.txt file and determine the shortest lines of text that the file contains. You will need to display to the console, the first three shortest lines.” Here is my solution to this event:
AdvancedEvent1Solution.vbs
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'' Adv_1.vbs
'' written by Stephanie Peters, Microsoft PFE
'' for the 2009 Summer Scripting Games
''
'' The goal of this script is to read a particular text file
'' (Personal Information Cards_ADV1.txt) and display the first three
'' shortest lines from that file.
''
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Option Explicit

' Constants specify parameters required in the scenario
Const FILE_NAME = "Personal Information Cards_ADV1.txt"
Const N_SHORTEST_LINES = 3

' Parameter Variables
Dim SKIP_EMPTY_LINES
Dim SKIP_WHITESPACE_ONLY
Dim TRIM_LINES
Dim UNIQUE

' Script Variables
Dim objLengthDictionary, objDupeDictionary
Dim objFSO, objFile
Dim strCurrentPath, strFilePath
Dim strCurrentLine, intCurrentLineLength, strCurrentLineNumber, strPad
Dim blnSkipThisLine
Dim intMaxLineLength, intMinLineLength
Dim intFileLineCounter, intLengthCheckCounter, intShortLineCounter
Dim aryCurrentLines

' Initialize Parameters
SKIP_EMPTY_LINES = Wscript.Arguments.Named.Exists("SkipEmpty")
SKIP_WHITESPACE_ONLY = Wscript.Arguments.Named.Exists("SkipWhiteLines")
TRIM_LINES = Wscript.Arguments.Named.Exists("Trim")
UNIQUE = Wscript.Arguments.Named.Exists("Unique")

' Set up objects for later use
Set objLengthDictionary = CreateObject("Scripting.Dictionary")
Set objDupeDictionary = CreateObject("Scripting.Dictionary")
Set objFSO = CreateObject("Scripting.FileSystemObject")

' Locate text file in the same folder with the currently running script
strCurrentPath = Replace(Wscript.ScriptFullName,Wscript.ScriptName,"")
strFilePath = strCurrentPath & FILE_NAME

' Open the text file for reading
Set objFile = objFSO.OpenTextFile(strFilePath)

' Read the file one line at a time
Do Until objFile.AtEndOfStream
    ' Keep track of which line we are currently reading
    intFileLineCounter = intFileLineCounter + 1
    strCurrentLine = objFile.ReadLine

    ' Reset blnSkipThisLine to False for new line
    blnSkipThisLine = False

    ' Use logical implication to determine whether
    ' to skip lines based on white space
    If _
      Not (SKIP_EMPTY_LINES Imp CBool(Len(strCurrentLine))) Then
        blnSkipThisLine = True
    ElseIf _
      Not (SKIP_WHITESPACE_ONLY Imp CBool(Len(Trim(strCurrentLine)))) Then
        blnSkipThisLine = True
    Else
        ' Trim leading and trailing whitespace if specified
        If TRIM_LINES Then
            strCurrentLine = Trim(strCurrentLine)
        End If

        ' Determine whether to skip the line based on uniqueness
        ' if specified
        If UNIQUE Then
            ' If the line is in objDupeDictionary, then it has
            ' already been encountered and is a duplicate.
            If objDupeDictionary.Exists(strCurrentLine) Then
                blnSkipThisLine = True
            Else
                ' This item is not a duplicate and needs to be
                ' added to the running list.
                objDupeDictionary(strCurrentLine) = Null
            End If
        End If
    End If

    ' If a line hasn't been marked to skip, then process it
    If Not blnSkipThisLine Then
        intCurrentLineLength = Len(strCurrentLine)

        ' Keep track of max and min line lengths for later use.
        If intCurrentLineLength > intMaxLineLength Then
            intMaxLineLength = intCurrentLineLength
        End If
        If intCurrentLineLength < intMinLineLength Then
            intMinLineLength = intCurrentLineLength
        End If

        ' Pad the leading zeros on the text file line number so the
        ' display will be aligned.
        strPad = String(3-Len(intFileLineCounter), "0")
        strCurrentLineNumber = strPad & intFileLineCounter

        ' Prepend the line number to the line and format it for output.
        strCurrentLine = "Line " & strCurrentLineNumber & ": """ & _
            strCurrentLine & """"

        ' If this line is a new length, then add it to the dictionary,
        ' otherwise append it to the existing item for this same length.
        If objLengthDictionary.Exists(intCurrentLineLength) Then
            objLengthDictionary(intCurrentLineLength) = _
                objLengthDictionary(intCurrentLineLength) & _
                vbNewLine & strCurrentLine
        Else
            objLengthDictionary(intCurrentLineLength) = strCurrentLine
        End If
    End If
Loop

' intShortLineCounter tracks which of nth shortest line we're looking for.
' We start at 1 because the first line we're looking for is the 1st shortest
' line.
intShortLineCounter = 1
For intLengthCheckCounter = intMinLineLength To intMaxLineLength
    ' If this length is found in the Dictionary, then we’ll use it.
    If objLengthDictionary.Exists(intLengthCheckCounter) Then
        ' write the header for this particular grouping
        wscript.echo
        wscript.echo "Number " & intShortLineCounter & _
            " shortest line(s) - Length=" & intLengthCheckCounter & ":"
        wscript.echo "===================================="

        ' Empty lines must be handled separately
        If objLengthDictionary(intLengthCheckCounter)="" Then
            wscript.echo objLengthDictionary(intLengthCheckCounter)
            intShortLineCounter = intShortLineCounter + 1
        Else
            ' In the event of a tie, the dictionary value will
            ' contain multiple values, so we need to separate them
            ' using the Split function
            aryCurrentLines = _
                Split(objLengthDictionary(intLengthCheckCounter), _
                vbNewLine)
            For Each strCurrentLine in aryCurrentLines
                wscript.echo strCurrentLine
                intShortLineCounter = intShortLineCounter + 1
            Next
        End If

        ' Once the required number of shortest lines have been
        ' discovered, then we're done.
        If intShortLineCounter > N_SHORTEST_LINES Then
            Exit For
        End If
    End If
Next
wscript.echo
wscript.echo
The preliminary questions are way too simple. In fact, these are some of the vaguest specs I have ever seen—and I have seen all sorts of specs. My mind raced to clarifying questions:
· Should empty lines be displayed?
· How about lines with only white space characters?
· Should leading and trailing white space be trimmed from the lines before calculating the length?
· If there are two identical lines, should they be handled separately or aggregated into unique lines only?
(Scripting Guys Note: The vague specs were intentional. It is part of what makes an advanced Scripting Games scenario more fun. It also gives you the flexibility to make choices and decisions for yourself. Because you are helping to decide upon the specifications for the scenario, you are the one who is in charge of your own destiny, and ultimately the one who will decide if you met or exceeded your personal goal.)
Of course, I didn’t have much luck getting these questions answered—a situation I’ve been in before. That being the case, I usually find it best to write into the script the flexibility to handle as many scenarios as reasonably possible and let the user choose how he or she would like to go forward. I decided early on to allow the user to specify how to handle these options.
Some other specification questions I had to answer for myself included the following:
· How should I handle a tie?
· What is the best way to display the lines in a way that makes the essence of the result clear?
You’ll see how I handled those a little later. After I had decided these particulars, I worked out a strategy I could execute. The first idea I had was that I should sort the lines based on length. However, the scenario makes no mention of having to do a full sort, so I decided I would do the following:
· Read each line of the file one by one.
· Make a decision as to if/how the line should be processed.
· Store the line in a Scripting.Dictionary object—keyed on the length of the line.
· Iterate possible lengths, starting with the shortest, and retrieve the matching lines from the Dictionary object until the required three shortest lines have been found.
· Display a header for each line length.
· Display each line that corresponds to that length enclosed in double quotation marks (to make potential white space apparent) and preceded by its line number in the file.
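The strategy in this list can be sketched in a few lines of PowerShell. This is not the script described in this article, only an illustration of the same idea; the input lines, the `$wanted` count, and all variable names are assumptions made for the example.

```powershell
# Made-up input; in the real script these lines come from the text file.
$lines = 'Claims', 'PPID', 'Street', 'Personal Information Cards', 'Gender'
$wanted = 3

# Bucket the lines by length: key = length, value = list of formatted lines.
$byLength = @{}
$min = [int]::MaxValue
$max = 0
$lineNumber = 0
foreach ($line in $lines) {
    $lineNumber++
    $len = $line.Length
    if ($len -lt $min) { $min = $len }
    if ($len -gt $max) { $max = $len }
    if (-not $byLength.ContainsKey($len)) { $byLength[$len] = @() }
    $byLength[$len] += 'Line {0:d3}: "{1}"' -f $lineNumber, $line
}

# Walk the possible lengths upward and stop once enough lines are collected.
$found = @()
for ($len = $min; $len -le $max -and $found.Count -lt $wanted; $len++) {
    if ($byLength.ContainsKey($len)) {
        "Length=${len}:"
        $byLength[$len]
        $found += $byLength[$len]
    }
}
```

With this sample input, the length-6 bucket holds three lines, so the loop collects four lines in total before stopping. A tie is reported whole rather than cut off at exactly three.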
With strategy in hand, it’s off to the, uh, scripting. For this script, we are following the four-part scripting model that was introduced in the Microsoft Press book, Microsoft Windows Scripting Self-Paced Learning Guide.
Last things first: the script header. It’s important, but honestly, it was the last thing I finished:
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'' Adv_1.vbs
'' written by Stephanie Peters, Microsoft PFE
'' for the 2009 Summer Scripting Games
''
'' The goal of this script is to read a particular text file
'' (Personal Information Cards_ADV1.txt) and display the three
'' shortest lines from that file.
''
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Option Explicit

' Constants specify parameters required in the scenario
Const FILE_NAME = "Personal Information Cards_ADV1.txt"
Const N_SHORTEST_LINES = 3

' Parameter Variables
Dim SKIP_EMPTY_LINES
Dim SKIP_WHITESPACE_ONLY
Dim TRIM_LINES
Dim UNIQUE

' Script Variables
Dim objLengthDictionary, objDupeDictionary
Dim objFSO, objFile
Dim strCurrentPath, strFilePath
Dim strCurrentLine, intCurrentLineLength, strCurrentLineNumber, strPad
Dim blnSkipThisLine
Dim intMaxLineLength, intMinLineLength
Dim intFileLineCounter, intLengthCheckCounter, intShortLineCounter
Dim aryCurrentLines
With mandatory housekeeping out of the way, it’s time to set up some reference items for later use. First, we read the arguments I mentioned earlier to provide flexibility in the way the user runs the script. The argument values are stored in the four variables you see below. These can be provided as command-line arguments to the script, as we will see later.
' Initialize Parameters
SKIP_EMPTY_LINES = Wscript.Arguments.Named.Exists("SkipEmpty")
SKIP_WHITESPACE_ONLY = Wscript.Arguments.Named.Exists("SkipWhiteLines")
TRIM_LINES = Wscript.Arguments.Named.Exists("Trim")
UNIQUE = Wscript.Arguments.Named.Exists("Unique")
Second, we’ll initialize some COM objects: two Dictionary objects and a FileSystemObject. Both object types are included with Windows Script Host, so they will be available on the Scripting Games test computer. Seasoned VBScripters will be familiar with the Dictionary object, but others may not be. The Dictionary holds a dynamic collection of key/value pairs and is good for a couple of things we’re going to need in this script: (1) removing duplicate items from a list, and (2) generating a quick index of values.
The FileSystemObject object is fairly straightforward. It’s going to allow us to read the text file from the file system:
' Set up objects for later use
Set objLengthDictionary = CreateObject("Scripting.Dictionary")
Set objDupeDictionary = CreateObject("Scripting.Dictionary")
Set objFSO = CreateObject("Scripting.FileSystemObject")
Before we open the “Personal Information Cards_ADV1.txt” file for reading, we have to locate it. I could have hard-coded the full path to it, but then it wouldn’t necessarily work for anyone else, especially because the file is currently in My Documents. We can use the ScriptFullName and ScriptName properties of the Wscript object to deduce the parent folder of the currently running script. This means that the text file and the script have to be in the same folder:
' Locate text file in the same folder with the currently running script
strCurrentPath = Replace(Wscript.ScriptFullName,Wscript.ScriptName,"")
strFilePath = strCurrentPath & FILE_NAME
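For readers working in Windows PowerShell, the same idea (derive the folder from the script's full path, then append the file name) might look like the sketch below. The `Get-SiblingPath` function is a hypothetical helper written for this example, and the script path is a stand-in built under the temp folder so the code runs anywhere.

```powershell
# Hypothetical helper: given the full path of a script, build the path of a
# file that sits in the same folder (the analogue of stripping ScriptName
# from ScriptFullName in the VBScript version).
function Get-SiblingPath {
    param(
        [string]$ScriptFullName,
        [string]$FileName
    )
    $folder = Split-Path -Path $ScriptFullName -Parent
    Join-Path -Path $folder -ChildPath $FileName
}

# Stand-in script path; no file needs to exist at this location.
$scriptPath = Join-Path ([System.IO.Path]::GetTempPath()) 'Adv_1.ps1'
$dataPath = Get-SiblingPath -ScriptFullName $scriptPath -FileName 'Personal Information Cards_ADV1.txt'
$dataPath
```

Inside a real script file on PowerShell 3.0 or later, the automatic variable `$PSCommandPath` supplies the full script path that the stand-in replaces here.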
Now comes the processing of the text file—reading it one line at a time.
' Open the text file for reading
Set objFile = objFSO.OpenTextFile(strFilePath)

' Read the file one line at a time
Do Until objFile.AtEndOfStream
    ' Keep track of which line we are currently reading
    intFileLineCounter = intFileLineCounter + 1
    strCurrentLine = objFile.ReadLine

    ' Reset blnSkipThisLine to False for new line
    blnSkipThisLine = False
Here I need to point out that I’ve been waiting about 12 years to use the logical implication (Imp) operator. I never found the occasion to use it, and I’m not sure why it struck me to use it now, but it suits what I need it to do perfectly (well, almost). Why have I never used it before? Fair warning: If logical operations make your head hurt, take my word for it and move on to the code.
In a logical implication, if the first value is False the result is always True, but if the first value is True, the result might be True depending on the second value. (See MSDN for a complete result chart for the Imp operator.)
Expression1    Expression2    Result
True           True           True
True           False          False
False          True           True
False          False          True
In our case, we actually need the negative of this logic. For instance, we have the Boolean setting SKIP_EMPTY_LINES. We can determine whether or not the current line is empty by using the expression CBool(Len(strCurrentLine)). The result is the value we would apply to the flag blnSkipThisLine to determine whether the current line should be skipped.
The result we want is as follows:
SKIP_EMPTY_LINES    CBool(Len(strCurrentLine))    blnSkipThisLine
True                True                          False
True                False                         True
False               True                          False
False               False                         False
This is the negative of the logical implication, which means we can pair it with the logical negation operator Not to get the results we need. The same logic goes for SKIP_WHITESPACE_ONLY, except that we are using Trim() to remove the white space before checking the length.
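Windows PowerShell has no Imp operator, but the negated implication is easy to spell out and check. In the sketch below, `Test-Implication` is a hypothetical helper defined only for this example; it relies on the fact that "a implies b" is equivalent to "(not a) or b".

```powershell
# Hypothetical helper standing in for VBScript's Imp operator:
# "a implies b" is the same as "(not a) or b".
function Test-Implication([bool]$a, [bool]$b) {
    (-not $a) -or $b
}

# Print the negated implication for every combination of the two inputs.
foreach ($skipEmptyLines in $true, $false) {
    foreach ($lineHasText in $true, $false) {
        $skipThisLine = -not (Test-Implication $skipEmptyLines $lineHasText)
        '{0,-5} {1,-5} -> {2}' -f $skipEmptyLines, $lineHasText, $skipThisLine
    }
}
```

Only one combination yields True: the skip setting is on and the line has no text. In other words, Not (a Imp b) is the same test as "a And Not b".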
I could have done this a different way, but when you get a chance to use something you have had your eye on for a long time, you just do it. (If I could think of a reason to use the StrReverse function, I think I would have used the whole language reference!)
    ' Use logical implication to determine whether
    ' to skip lines based on white space
    If Not (SKIP_EMPTY_LINES Imp CBool(Len(strCurrentLine))) Then
        blnSkipThisLine = True
    ElseIf _
      Not (SKIP_WHITESPACE_ONLY Imp CBool(Len(Trim(strCurrentLine)))) Then
        blnSkipThisLine = True
    Else
Now that we know that either the SKIP- settings weren’t specified or the line actually contains non-white space characters, we can go about processing them. The next setting to check is whether or not the user wanted to trim leading and trailing white space from the lines before calculating their length:
        ' Trim leading and trailing whitespace if specified
        If TRIM_LINES Then
            strCurrentLine = Trim(strCurrentLine)
        End If
We might need to check that the values are unique based on the value of UNIQUE, which was passed in (or not) from the command line.
To remove duplicate values in VBScript, I always like to use a Dictionary object. In fact, 99 percent of the time when I use a Dictionary object, it’s for this purpose. In this case, we add the current line of text as a key in the Dictionary, and then we can check whether that item has been added before by using the Exists method. Because it’s the key we’re interested in, it doesn’t matter what value you assign to it, so I’m using Null.
        ' Determine whether to skip the line based on uniqueness
        ' if specified
        If UNIQUE Then
            ' If the line is in objDupeDictionary, then it has
            ' already been encountered and is a duplicate.
            If objDupeDictionary.Exists(strCurrentLine) Then
                blnSkipThisLine = True
            Else
                ' This item is not a duplicate and needs to be
                ' added to the running list.
                objDupeDictionary(strCurrentLine) = Null
            End If
        End If
    End If
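The same keys-only trick carries over to a Windows PowerShell hash table. The sketch below uses made-up input and is an illustration rather than a translation of the script. One difference worth noting: a hash table created with @{} compares string keys without regard to case, whereas a Scripting.Dictionary is case-sensitive by default.

```powershell
# Made-up input with repeated lines.
$lines = 'PPID', '', 'Claims', '', 'PPID', 'Street'

# The hash table plays the role of objDupeDictionary: only the keys matter.
# Note: @{} matches string keys case-insensitively.
$seen = @{}
$unique = foreach ($line in $lines) {
    if (-not $seen.ContainsKey($line)) {
        $seen[$line] = $null   # the value is irrelevant
        $line                  # first occurrence: pass it through
    }
}

$unique
```

Each line is emitted only the first time it is seen, so the original order of the file is preserved.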
At this point, we have definitively determined whether or not the current line should be skipped. If not, we’ll go ahead and process it, meaning that we’ll determine the length of the line (keeping track of min and max lengths so that we’ll know what range we’ve accumulated) and store the line information in another Dictionary object, which is keyed on the length of the line:
    ' If a line hasn't been marked to skip, then process it
    If Not blnSkipThisLine Then
        intCurrentLineLength = Len(strCurrentLine)

        ' Keep track of max and min line lengths for later use.
        If intCurrentLineLength > intMaxLineLength Then
            intMaxLineLength = intCurrentLineLength
        End If
        If intCurrentLineLength < intMinLineLength Then
            intMinLineLength = intCurrentLineLength
        End If

        ' Pad the leading zeros on the text file line number so the
        ' display will be aligned.
        strPad = String(3-Len(intFileLineCounter), "0")
        strCurrentLineNumber = strPad & intFileLineCounter

        ' Prepend the line number to the line and format it for output.
        strCurrentLine = "Line " & strCurrentLineNumber & ": """ & _
            strCurrentLine & """"

        ' If this line is a new length, then add it to the dictionary,
        ' otherwise append it to the existing item for this same length.
If the current line length has already been keyed, we’ll append the new line to whatever lines were already determined to have the same length. Otherwise, we’ll add the new length key and assign the current line to it:
        If objLengthDictionary.Exists(intCurrentLineLength) Then
            objLengthDictionary(intCurrentLineLength) = _
                objLengthDictionary(intCurrentLineLength) & _
                vbNewLine & strCurrentLine
        Else
            objLengthDictionary(intCurrentLineLength) = strCurrentLine
        End If
    End If
Loop
After this, you might (depending on the arguments passed) have a Dictionary that looks something like this:
Key    Value
40     Line 001: "Understanding Personal Information Cards"
       Line 011: "On the Start menu, click Control Panel."
       Line 013: "Double click the Windows CardSpace icon."
421    Line 002: "CardSpace provides users the ability to access,…
0      Line 003: ""
26     Line 004: "Personal Information Cards"
805    Line 005: "Personal Information Cards (also called Self-…
36     Line 007: "Creating a Personal Information Card"
77     Line 008: "To create a Personal Information Card, start the…
…
This isn’t technically a sort, but it’s as close to one as we need for the purpose of this scenario, because now we can start at the minimum length we recorded and check every increment from there until we get the three shortest lines. After that, we don’t really care about the order of the rest of the lines.
This brings us to the output section of the script:
' intShortLineCounter tracks which of nth shortest line we're looking for.
' We start at 1 because the first line we're looking for is the 1st shortest
' line.
intShortLineCounter = 1
For intLengthCheckCounter = intMinLineLength To intMaxLineLength
    ' If this length is found in the Dictionary, then we’ll use it.
    If objLengthDictionary.Exists(intLengthCheckCounter) Then
        ' write the header for this particular grouping
        wscript.echo
        wscript.echo "Number " & intShortLineCounter & _
            " shortest line(s) - Length=" & intLengthCheckCounter & ":"
        wscript.echo "===================================="

        ' Empty lines must be handled separately
        If objLengthDictionary(intLengthCheckCounter)="" Then
            wscript.echo objLengthDictionary(intLengthCheckCounter)
            intShortLineCounter = intShortLineCounter + 1
        Else
            ' In the event of a tie, the dictionary value will
            ' contain multiple values, so we need to separate them
            ' using the Split function
            aryCurrentLines = _
                Split(objLengthDictionary(intLengthCheckCounter), _
                vbNewLine)
            For Each strCurrentLine in aryCurrentLines
                wscript.echo strCurrentLine
                intShortLineCounter = intShortLineCounter + 1
            Next
        End If

        ' Once the required number of shortest lines have been
        ' discovered, then we're done.
        If intShortLineCounter > N_SHORTEST_LINES Then
            Exit For
        End If
    End If
Next
It’s time to show what the output of the script is when you call it with the various arguments. The first one has no arguments, so it doesn’t skip any lines and also includes duplicates. Accordingly, there is a 16-way tie for the shortest line, and all of those lines have a length of zero:
CMD> cscript .\Adv_1.vbs //nologo

Number 1 shortest line(s) - Length=0:
====================================
Line 003: ""
Line 006: ""
Line 009: ""
Line 012: ""
Line 014: ""
Line 016: ""
Line 018: ""
Line 020: ""
Line 022: ""
Line 024: ""
Line 026: ""
Line 029: ""
Line 076: ""
Line 079: ""
Line 081: ""
Line 083: ""
When we add the /unique switch, the empty lines are aggregated and only the first empty line number is shown:
CMD> cscript .\Adv_1.vbs /unique //nologo

Number 1 shortest line(s) - Length=0:
====================================
Line 003: ""

Number 2 shortest line(s) - Length=1:
====================================
Line 033: " "

Number 3 shortest line(s) - Length=4:
====================================
Line 067: "PPID"
When we add the /trim switch, leading and trailing spaces are ignored, which also causes the single-space lines to be aggregated with the zero-length lines. Here, we have a three-way tie for third place:
CMD> cscript .\Adv_1.vbs /unique /trim //nologo

Number 1 shortest line(s) - Length=0:
====================================
Line 003: ""

Number 2 shortest line(s) - Length=4:
====================================
Line 067: "PPID"

Number 3 shortest line(s) - Length=6:
====================================
Line 027: "Claims"
Line 037: "Street"
Line 064: "Gender"
The /skipwhitelines switch removes all empty or effectively empty lines from the output, making line 67 “PPID” the shortest line, followed by a three-way tie for second. Because four lines are accounted for, there is no reason to move beyond second place.
CMD> cscript .\Adv_1.vbs /unique /trim /skipwhitelines //nologo

Number 1 shortest line(s) - Length=4:
====================================
Line 067: "PPID"

Number 2 shortest line(s) - Length=6:
====================================
Line 027: "Claims"
Line 037: "Street"
Line 064: "Gender"
Whew! That was a lot of code, and I’m tired. To be honest, I admit I would normally never have done this kind of heavy lifting with VBScript, which is so handicapped in the areas of filtering and sorting. Typically, I would just use the shell to execute the one or two lines of Windows PowerShell that it would take to do the exact same thing. This being the Scripting Games, though, it was a fun exercise, and I’m pretty happy with the result. Here is what the results look like when the script completes running.
Awesome work Steven, Kirk, and Stephanie! What a terrific way to conclude the first day of the Summer Scripting Games 2009. Join us tomorrow as we reveal Event 7. Tomorrow, we will also have solutions from another stellar group of commentators for Event 2—the long jump. For all the latest Scripting Games information, follow us on Twitter. See you tomorrow.
Ed Wilson and Craig Liebendorfer, Scripting Guys
Dear Scripting Guys,
I just worked a little bit through the description of event 1 and read the solutions here :-)
Wonderful solutions, and I wonder, if I should add another, less brilliant, one to these!
One note to Steven's solution:
I would change the
[regex]$regex = "^(?<Name>\w+?,\ \w+?)\s+(?<Country>.+?)\s+(?<Time>\d+\.\d+)$"
to
[regex]$regex = "^(?<Name>[^\t]+)\t(?<Country>[^\t]+)\t(?<Time>\d+\.\d+)$"
Why? Not because this one is shorter :-) It is a little bit "more correct", because with the original you can't decide whether
Hansen, Anne Grethe Japan 8.85
is a two components first name or you have a country with two components like in:
Pfeiffer, Michael United States 8.85
In fact, I thought at first sight that this problem couldn't be solved, but a close look into the text file revealed that the three components are separated by TAB characters, which makes it possible to separate the three parts correctly!
kind regards, Klaus
Klaus,
Right you are. I actually caught that after I had submitted my solution a while ago and wanted to change my regexes to
PowerShell -
"^(?<Name>\w+?,\ .+?)\t(?<Country>.+?)\s+(?<Time>\d+\.\d+)$"
and Vbscript should be
"^(\w+?,\ .+?)\t(.+?)\s+(\d+\.\d+)$"
but it was a bit too late.
Hi,
what a great and simple solution for the advanced PowerShell event. The output will look a bit strange if lines that start with white space get enumerated.
To prevent this, I put the text file in an array and TRIM'ed away the white space for each entry. Output and sorted for the rest just like you did.
regards,
Patrik