Hey, Scripting Guy! Event 3 *Solutions* from Expert Commentators (Beginner and Advanced; the shot put)

(Note: These solutions were written for Event 3.)   

Beginner Event 3: The Shot Put

In the shot put event, you must be able to handle a heavy load of text. To make it easier for you to carry the load, you will be asked to balance text between two files. The detailed event description was revealed last Wednesday.

Guest commentator: Alex K. Angelopoulos

Image of guest commentator Alex K. Angelopoulos

Alex Angelopoulos is a former Microsoft scripting MVP. He maintains the Windows Scripting Web site, which is a veritable treasure trove of information related to scripting, with an emphasis on Microsoft Visual Basic Scripting Edition (VBScript).

VBScript solution

There are several ways to complete the Beginner Shot Put event in VBScript. I chose a particularly rapid technique that reads the entire file all at once, and then uses VBScript's split method to break the contents into two parts.

In its simplest form, the split method takes an arbitrary line of text and splits it at the spaces in the string. Therefore, you can do something like this:

words = Split("two words")
for each word in words
          WScript.Echo word
next

The previous code returns two words, each of which is on its own line. This is seen here:

two
words

The split method can also take an optional second argument that tells VBScript to look for something to split on other than just a space. If you change the initial line shown above to include the second parameter, you will get something that looks like this:

words = Split("two words", "o w")

The output is now completely different, and is shown here:

tw
ords

The split method works just as easily with displayed and nondisplayed characters. Embedded line terminations are characters that are not displayed when you open a text file but are located in the file. In Windows, the default line termination is a sequence of two characters: the carriage return and the line feed. VBScript even includes a built-in constant representing this two-character sequence, vbCrLf.

What is significant for the shot put data is that the boundary between the two paragraphs is a blank line. Even though there are no visible characters on that line, it is preceded by a carriage return and a line feed, and is immediately followed by a carriage return and a line feed. This character sequence written in VBScript is seen here:

 

vbCrLf & vbCrLf

It is also important to realize that this character sequence appears nowhere else in the data. Therefore, if we read the contents of the entire file in one step, we can split it on vbCrLf & vbCrLf, instead of reading the file line by line. This might seem a little confusing, so let us take a look at the solution.

Here is a complete solution using the handy split method, with embedded commentary and some commented-out code at the end that is usable for test cycling.

BeginnerEvent3Solution.vbs

Option Explicit
' we get a reference to the FileSystemObject (a.k.a. FSO)
Dim fso: Set fso = CreateObject("Scripting.FileSystemObject")
' Constants used to control the mode for opening files
Const ForReading = 1, ForWriting = 2

' open the data file for reading.
' ForReading is the default mode for OpenTextFile, so the second
' argument is optional; it is spelled out here for readability.
Dim file: Set file = fso.OpenTextFile("Shot Put.txt", ForReading)

' this reads the file contents into the contents variable
Dim contents: contents = file.ReadAll

' Because we're done with the file now, we close it.
' If we don't, we'll get a "Permission denied" error later
'  when we try to rename it.
file.Close

' Now split contents on the repeated line ending;
' this gives a 2-member array with multiline text blocks in each one.
Dim data: data = Split(contents, vbCrLf & vbCrLf)

' Now we create both files using OpenTextFile.
' We could have used
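Dim FileA, FileB
Set FileA = fso.OpenTextFile("Shot Put A.txt", ForWriting, True)
FileA.Write data(0)
' Close each output file as soon as we have finished writing to it.
FileA.Close
Set FileB = fso.OpenTextFile("Shot Put B.txt", ForWriting, True)
FileB.Write data(1)
FileB.Close
' Finally, rename the original data file.
fso.MoveFile "Shot Put.txt", "Shot Put.old"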
' following lines were used to speed up the testing cycle;
' after moving the file, I make sure it exists from the script,
' then rename it back to the old name so I don't need to go
' in and change the file extension via Windows Explorer.
' FileExists returns a true/false value, which gets echoed
' as either -1 or 0. If we force it to string form using CStr,
' we get a readable "True" or "False" instead.
'WScript.Echo "file was renamed?", CStr(fso.FileExists("Shot Put.old"))
'fso.MoveFile "Shot Put.old", "Shot Put.txt"

When we run the script, two text files are created. Those files are seen here:

Image of text files created by script

This is extremely wordy code, but it has the advantage of being readable, which also makes it easier to maintain. An experienced scripter could condense it to the following five lines. The result is not nearly as readable, and I do not recommend it for anything but a throwaway script:

BeginnerEvent3_CompactVersion.vbs

Set fso = CreateObject("Scripting.FileSystemObject")
data = Split(fso.OpenTextFile("Shot Put.txt").ReadAll, VbCrLf & vbCrLf)
fso.CreateTextFile("Shot Put A.txt").Write data(0)
fso.CreateTextFile("Shot Put B.txt").Write data(1)
fso.MoveFile "Shot Put.txt", "Shot Put.old"

Guest commentator: Alex K. Angelopoulos

Scripting Guys Note: As it turns out, Alex is bilingual: he scripts in both VBScript and Windows PowerShell. When he told us that he did his prototype VBScript in Windows PowerShell, we asked him to go ahead and submit his Windows PowerShell solution to Beginner Event 3. In fact, Alex is not unusual in this regard. We also use Windows PowerShell to prototype a solution to a VBScript problem. We do this often when we get questions such as, “I am trying to do such and such with VBScript, but the Windows Management Instrumentation (WMI) class is not returning any data.” Testing a WMI class is so much faster in Windows PowerShell that we always check the class there before writing a VBScript answer to the problem. For more information about using Windows PowerShell to query WMI, check out the Hey, Scripting Guy! articles from the week of March 6, 2009.

Windows PowerShell solution

To solve the shot put problem with Windows PowerShell, you must understand how to use the Get-Content cmdlet. You provide the Get-Content cmdlet with the name of a file, and it automatically retrieves all of the data from the file. With the shot put problem, however, using Get-Content the easy way actually gives us a harder problem to solve. This is due in part to the design of Get-Content.

The design of Get-Content reflects the Windows PowerShell preference for pipeline processing. Instead of reading data in as a monolithic block of text, Get-Content returns the file line by line. This means that when you read the contents of a file and store the results in a variable, you get a collection of lines from the text file. This is seen here:

$data = Get-Content '.\Shot Put.txt'

Because there are multiple line breaks in the file, you end up with an array of nine elements, each containing one line of text from the file. You can see that the data is stored in an array by querying the Count property. This is seen here:

$data.Count

Because the goal of the shot put event is to split the data file into two separate paragraph files by using the blank line in the file as a boundary marker, the obvious way to work with this problem is to locate the blank line. The following code checks the length of each line that is stored in the $data variable. When it finds the empty line, it exits the loop with the index of the empty line stored in the variable $i. This is seen here:

for($i = 0; $i -lt $data.Count; $i++){if($data[$i].Length -eq 0){break}}

We can then use the Windows PowerShell array notation to select all of the lines before the empty line and put them into one file. We repeat this process with the lines of text after the empty line. This is shown here:

$data[0..($i - 1)] | Set-Content '.\Shot Put A.txt'
$data[($i + 1)..($data.Count - 1)] | Set-Content '.\Shot Put B.txt'

This is an effective and fairly compact solution. However, it makes the problem more complicated than necessary. The Get-Content -Delimiter parameter allows us to specify an arbitrary string as the delimiter to use when separating chunks of data in a file. Because we must split the file at the point where there is an empty line, we can use the empty line as the delimiter. Although the line is empty, it marks a point where there are two line terminations in a row. A standard line termination in Windows is a carriage return followed by a line feed (represented with the escape sequence `r`n in Windows PowerShell), so we can use `r`n`r`n as our delimiter. This is shown here:

$data = Get-Content '.\Shot Put.txt' -Delimiter "`r`n`r`n"

In one step, we have imported the data from the Shot Put.txt file as an array of two elements, each containing data for a paragraph file. We just write the data to the files and we are finished. This is seen here:

$data[0] | Set-Content '.\Shot Put A.txt'
$data[1] | Set-Content '.\Shot Put B.txt'

Here’s the full script:

BeginnerEvent3Solution.ps1

$data = Get-Content '.\Shot Put.txt' -Delimiter "`r`n`r`n"
$data[0] | Set-Content '.\Shot Put A.txt'
$data[1] | Set-Content '.\Shot Put B.txt'
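
A related variation, offered as a sketch rather than as Alex's method: read the file as a single string and break it apart with the -split operator. It assumes PowerShell 3.0 or later, because the -Raw switch is not available in earlier versions, and it builds its own small sample file in the temp folder so that it can run anywhere. The folder name ShotPutDemo and the sample text are placeholders, not the event's real data.

```powershell
# Sketch only: create a throwaway sample file so the example is self-contained.
$dir = Join-Path ([IO.Path]::GetTempPath()) 'ShotPutDemo'   # hypothetical folder name
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$source = Join-Path $dir 'Shot Put.txt'
[IO.File]::WriteAllText($source, "line one`r`nline two`r`n`r`nline three`r`nline four")

# -Raw (PowerShell 3.0 and later) returns the whole file as one string
# instead of one string per line.
$text = Get-Content -LiteralPath $source -Raw

# -split takes a regular expression; \r\n\r\n matches the blank-line boundary.
$parts = $text -split "\r\n\r\n"

# Write each paragraph to its own file.
$parts[0] | Set-Content -LiteralPath (Join-Path $dir 'Shot Put A.txt')
$parts[1] | Set-Content -LiteralPath (Join-Path $dir 'Shot Put B.txt')
```

Because -split removes the separator it matches, neither part carries the blank line with it.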

You could also write a solution to the Shot Put Beginner Event by using the .NET Framework classes. If you did this, it might look something like the following:

BeginnerEvent3Solution_Net_Classes.ps1

[string]$data = [IO.File]::ReadAllText($(Resolve-Path '.\Shot Put.txt'))
$split = [Text.RegularExpressions.Regex]::Split($data, "`r`n`r`n")
$split[0] | Set-Content -Path '.\Shot Put A.txt'
$split[1] | Set-Content -Path '.\Shot Put B.txt'
Rename-Item '.\Shot Put.txt' '.\Shot Put.old'
# Following line for debugging purposes
Rename-Item '.\Shot Put.old' '.\Shot Put.txt'

Advanced Event 3: The Shot Put

The shot put event involves throwing a heavy, metal ball. Some people think decathlon events are all the same. For the shot put event, you are required to find words in a file that contain all the same vowels.

Guest commentator: Scott Hanselman

Image of guest commentator Scott Hanselman

Scott Hanselman is a principal program manager at Microsoft. He maintains the computer zen web site and can be found on Twitter.

VBScript solution

In the Advanced Shot Put Event, we must search a file for words in which every vowel is the same. We will use the FileSystemObject object to read a text file. After we create the FileSystemObject and store it in a variable, we use the OpenTextFile method to open the text file, and the CreateTextFile method to create a new text file. This is seen here:

Set fso = CreateObject("Scripting.FileSystemObject")
Set f1 = fso.OpenTextFile("Wordlist_ADV3.txt")
Set f2 = fso.CreateTextFile("JustVowels.txt", True)

We then use the Do…While…Loop to walk through the text file, one line at a time. For each line, we set a variable named isUniVowel to True. We use the Mid function to examine each letter in the string, and we call the IsVowel function to see if the letter is a vowel. The first vowel we find is stored in firstVowel. We then continue through the string to see if there is an additional vowel that is not equal to the first vowel. If we find a different vowel, we set isUniVowel to False. This is seen here:

For i = 0 To strLen - 1
    maybeVowel = Mid(s, i + 1, 1)
    If IsVowel(maybeVowel) Then
        If firstVowel = "" Then
            'Wscript.Echo "Found " + maybeVowel
            firstVowel = maybeVowel
        End If
        'Wscript.Echo "Testing " + maybeVowel + " against " + firstVowel
        If firstVowel <> maybeVowel Then
            isUniVowel = False
        End If
    End If
Next

The complete AdvancedEvent3Solution.vbs script is shown here:

AdvancedEvent3Solution.vbs

Dim fso, f1, f2, s, i, strLen, maybeVowel
Dim firstVowel, isUniVowel
Const ForReading = 1
Set fso = CreateObject("Scripting.FileSystemObject")
Set f1 = fso.OpenTextFile("Wordlist_ADV3.txt", ForReading)
Set f2 = fso.CreateTextFile("JustVowels.txt", True)

Do While Not f1.AtEndOfStream
    ' Each line of the word list holds one word.
    s = f1.ReadLine
    strLen = Len(s)
    firstVowel = ""
    isUniVowel = True
    For i = 0 To strLen - 1
        maybeVowel = Mid(s, i + 1, 1)
        If IsVowel(maybeVowel) Then
            If firstVowel = "" Then
                'Wscript.Echo "Found " + maybeVowel
                firstVowel = maybeVowel
            End If
            'Wscript.Echo "Testing " + maybeVowel + " against " + firstVowel
            If firstVowel <> maybeVowel Then
                isUniVowel = False
            End If
        End If
    Next
    If isUniVowel Then
        Wscript.Echo s
        f2.WriteLine s
    End If
Loop
f1.Close
f2.Close

' Returns True when sFnd is one of the five lowercase vowels.
Function IsVowel(sFnd)
    If (sFnd = "a") Or _
       (sFnd = "e") Or _
       (sFnd = "i") Or _
       (sFnd = "o") Or _
       (sFnd = "u") Then
        IsVowel = True
    Else
        IsVowel = False
    End If
End Function

Guest commentator: Shay Levy

Image of guest commentator Shay Levy

Shay Levy is a Windows PowerShell MVP and a moderator for the Official Scripting Guys Forum. He maintains the Script Fanatic blog, and is the author of the PowerShell toolbar for Internet Explorer. You can also follow him on Twitter.

Windows PowerShell solution

In English, there are five letters that always represent a vowel when written: a, e, i, o, and u. This Advanced event requires that we get all of the univowel words from a given file and write them to a new file. There are two rules we must follow:

  • A word can have only one unique vowel (for example, "go").
  • The same vowel can appear more than once in the same word (for example, "good").

Our approach to this script is to break the solution into three tasks:

1.     Get the file content.

2.     Process the univowel words.

3.     Export the univowel words to a new file.

The first thing we must do is get the file content and examine the words found in the file. Getting the file content is basically an easy task. We use the Get-Content cmdlet, which reads the file content one line at a time and returns an object for each line. In our task, each line represents a word. Here are the first 10 items of the WordList_Adv3.txt file:

PS > Get-Content .\Wordlist_ADV3.txt | Select-Object -First 10
the
be
of
and
a
to
in
he
have
it

 

As you can see, the first line contains the word “the”. The word is a univowel word because it contains only the vowel “e”. On the other hand, line number nine contains the word “have”; it is not a legal univowel word because it contains two vowels, “a” and “e”.

Now let us skip to the third part of our task and see how we can export the words (that we have not found yet) to a file. We can redirect the output to a file with the Out-File cmdlet:


PS > Out-File .\univowel.txt

So we know how to get the content of the file and how to export the results to a new file. We can write a temporary command that looks like the following:


PS > Get-Content .\Wordlist_ADV3.txt | ... | Out-File .\univowel.txt -Encoding ASCII

Finally, we must solve task number two, which processes each line and checks whether the current word is univowel. To accomplish this part of the script, we first discover whether a word contains vowel characters by using the -match operator and a regular expression pattern. This is seen here:

if($word -match '[aeiou]')
{
    Write-Host "Word: $word contains vowels"
}

The regular expression pattern [aeiou] is a "character class." A character class matches exactly one character out of the set it lists. We can now filter words that contain vowels by using the Where-Object cmdlet. The Where-Object cmdlet creates a filter that passes an object along the command pipeline only if the expression evaluates to True. Our updated command is seen here:

PS > Get-Content .\Wordlist_ADV3.txt | Where-Object {$_ -match '[aeiou]'} | Out-File .\univowel.txt -Encoding ASCII
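
To see what the character class does on its own, here is a small sketch that applies the same -match test to three sample words. The words are chosen for illustration and are not taken from the word list. Only "rhythm" fails the test, because it contains none of the five vowels; note that "have" passes even though it is not a univowel word, which is why this filter alone is not enough.

```powershell
# Sample words for illustration only.
$samples = 'rhythm', 'the', 'have'

$results = foreach ($word in $samples) {
    # -match returns True when at least one character of the word
    # belongs to the character class [aeiou].
    '{0,-7} contains a vowel: {1}' -f $word, ($word -match '[aeiou]')
}

$results
```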

But that is not the only part of task number two. We also must pick out the univowel words, and for that we can use the String.ToCharArray method. With this method, we can convert a string to an array of characters. This is seen here:

PS > "have".ToCharArray()
h
a
v
e

We take the results of the ToCharArray method, match them against the [aeiou] regular expression pattern, and pipe the results to the Group-Object cmdlet. With the Group-Object cmdlet, we can display objects in groups based on the value of a specified property. The result is a table with one row (object) for each property value, and a column that displays the number of items with that value. This is seen here:

PS > "have".ToCharArray() -match '[aeiou]' | Group-Object
Count Name Group
----- ---- -----

    1 a    {a}
    1 e    {e}

 

Next we pipe the results to the Measure-Object cmdlet, which counts the groups. We keep only the words for which that count is equal to 1. This excludes the words with more than one distinct vowel. This is seen here:

PS > "choose".ToCharArray() -match '[aeiou]' | Group-Object

Count Name Group
----- ---- -----
    2 o    {o, o}
    1 e    {e}

PS > "choose".ToCharArray() -match '[aeiou]' | Group-Object | Measure-Object

Count    : 2
Average  :
Sum      :
Maximum  :
Minimum  :
Property :

We are very close to solving this event. The expression above finds two groups for “choose”: one for “o” and one for “e”. If more than one group is returned, the word is multivoweled. If exactly one group is returned, it is a univowel word. Let's test our approach again:

PS > ("the".ToCharArray() -match '[aeiou]' | Group-Object | Measure-Object).count -eq 1
True

Because we obtained True, we know that only one distinct vowel exists in the word. Let's try it with a word that has more than one vowel in it:

PS > ("have".ToCharArray() -match '[aeiou]' | Group-Object | Measure-Object).count -eq 1
False
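
The same test can be wrapped in a small function so that it is easy to try against several words at once. This is only a sketch: Test-Univowel is a hypothetical helper name that does not appear in the original solution, and the sample words are chosen for illustration.

```powershell
# Test-Univowel is a hypothetical helper name, not part of the original solution.
function Test-Univowel {
    param([string]$Word)

    # Keep only the vowel characters, then group identical vowels together.
    $groups = $Word.ToCharArray() -match '[aeiou]' | Group-Object

    # Exactly one group means exactly one distinct vowel in the word.
    return ($groups | Measure-Object).Count -eq 1
}

# Try the helper on a few sample words.
'the', 'good', 'have', 'choose' | ForEach-Object {
    '{0,-7} {1}' -f $_, (Test-Univowel $_)
}
```

With these inputs, "the" and "good" come back True, and "have" and "choose" come back False. A word with no vowels at all also comes back False, because it produces zero groups.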

Now we must integrate the expression into a Where-Object cmdlet and take a look at the last 10 items. This is seen here:

PS > Get-Content d:\Wordlist_ADV3.txt | Where-Object { ($_.ToCharArray() -match '[aeiou]' | Group-Object | Measure-Object).Count -eq 1 } | Select-Object -Last 10
paw
tidy
mend
thorn
chalk
berry
pastry
scold
whom
sow

Now that we are confident in our approach, we can assemble the complete command. It is shown here:

AdvancedEvent3Solution.ps1

Get-Content .\Wordlist_ADV3.txt |
Where-Object { ($_.ToCharArray() -match '[aeiou]' |
Group-Object | Measure-Object).Count -eq 1 } |
Out-File -FilePath .\univowel.txt -Encoding ASCII

When we run the AdvancedEvent3Solution.ps1 script, a text file is created. You will notice that this text file contains 575 lines. Because each univowel word occupies a single line, we have discovered 575 univowel words:

Image of file with 575 univowel words

 

All right! We have come to the end of Event 3. Alex, Scott, and Shay have certainly raised the bar. We don't know about you, but we have learned things already. Join us tomorrow as we reveal the details for the pole vault event. We will also bring in another group of guest commentators to share their answers to Event 4. Remember, you can find support on the Scripting Games Forum, and receive up-to-the-minute Scripting Games news and bulletins by following us on Twitter.


Ed Wilson and Craig Liebendorfer, Scripting Guys

 

  • Hello Scripting guys,

    thank you very much for the wonderful solutions again! Brilliant ... and I really begin to like that powershell stuff :-)

    @Shay: You lost the "Measure-Object" somewhere ... somehow *sss*

    Your solution will not meet condition 2:

    "The same vowel can appear more than once in the same word (for example, "good")"

    If you look at the notepad output, you will be missing the "sweeten" e.g. at the end:

    whom
    sweeten
    sow

    So I would add the "Measure-Object" again:

    Get-Content .\Wordlist_ADV3.txt |
    Where-Object { ($_.ToCharArray() -match '[aeiou]' |
    Group-Object | Measure-Object).Count -eq 1 } |
    Out-File -FilePath .\univowel.txt -Encoding ASCII

    and it will produce the same results as my regex:

    gc Wordlist_ADV3.txt | %{ if ($_ -match "^[^aeiou]*([aeiou])([^aeiou]|\1)*$") { $_ }} | Out-File Univowel.txt ASCII

    kind regards, Klaus