Hey, Scripting Guy! Question

Hey, Scripting Guy! I've been asked to create a script which will read the contents of addresses.txt and output ONLY the MAC addresses listed in the text file. Could you give me some pointers or even a demo script with which to give me a boost?
-- FR

Hey, Scripting Guy! Answer

Hi FR,

The last time someone asked the person writing this column for a boost, the result was lower back pain that lasted for days. But this is a really common and important scripting problem and the scripts are pretty lightweight. So let’s develop a script that provides a framework for this type of task. You said in your email that you had already figured out how to read the file contents into a variable in your script. But we’ll begin there to be sure that everyone gets the full story.

We need to use the FileSystemObject object. So we have to write a bit of startup code that prepares that object for use.

Set objFS = CreateObject("Scripting.FileSystemObject")

Let’s assume that the text file we’re working with is not super large, so loading it into memory is not an issue. In that case, the next step is to slurp up all the content in the file and store it in a variable in your script. You do this by using the reasonably-named ReadAll method.

Set objFS = CreateObject("Scripting.FileSystemObject")
Set objFile = objFS.OpenTextFile("C:\logs\logfile.txt")
strFileContents = objFile.ReadAll

OK, so we now have the contents of C:\logs\logfile.txt stored in the variable named strFileContents. To see that this worked, you can display those contents by adding the line WScript.Echo strFileContents to the end of the script.

Note: If you do this, be sure to run the script by typing cscript.exe scriptname.vbs instead of using wscript.exe. That will ensure that the output is displayed in the command prompt window rather than in a popup message box.
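The snippet above assumes that the file exists and has something in it. As a minimal sketch of a more defensive read, the following wraps the same calls in a helper function. The name ReadTextFile is hypothetical (it is not part of VBScript), and returning an empty string for a missing or empty file is just one possible design choice.

```vbscript
' Hypothetical helper: returns the file's text, or "" if the file
' is missing or empty.
Function ReadTextFile(strPath)
    Const ForReading = 1
    Dim objFS, objFile
    ReadTextFile = ""
    Set objFS = CreateObject("Scripting.FileSystemObject")
    If Not objFS.FileExists(strPath) Then Exit Function
    Set objFile = objFS.OpenTextFile(strPath, ForReading)
    ' ReadAll raises an error on an empty file, so check first.
    If Not objFile.AtEndOfStream Then
        ReadTextFile = objFile.ReadAll
    End If
    ' Release the file handle once the text is in memory.
    objFile.Close
End Function

strFileContents = ReadTextFile("C:\logs\logfile.txt")
```

With this in place, the rest of the script can test whether strFileContents is an empty string before going any further.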

Now, we want to do something (extract a MAC address in this case) to each line that we read in from the file. But, at this point, we have all of the lines munged together and stored as a single string of text.

Luckily, we haven’t lost the information about where one line ended and another began. Line breaks are stored as characters that your favorite text editor hides from you. (Well, to be fair, it doesn’t really hide them. It interprets them and breaks lines accordingly).

ReadAll stored those line break characters in the strFileContents variable. Because of that, we can use those characters to break up the stored content into lines and store those lines in a type of variable that can hold a set of things. That means we want to split the contents into distinct lines and store the results in an array variable.

VBScript gives us a function that does exactly that. We’ll use two of the Split function’s arguments: a string to split up and a shorter string that indicates where to make the cuts. In our case, we want to snip the string wherever a line ends, which in a Windows text file means a carriage return followed by a linefeed. VBScript provides the built-in constant vbNewLine to represent that line break.

So, the following code chops the content in strFileContents into lines and stores those lines in the array variable named arrLines.

arrLines = Split(strFileContents, vbNewLine)
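One caveat: on Windows, vbNewLine is the carriage return plus linefeed pair, so a file saved with bare linefeeds (as many non-Windows tools do) would come back as a single giant line. The following is a rough sketch of one way to cope, assuming you can't be sure which style your file uses. The helper name SplitLines is made up for this illustration.

```vbscript
' Hypothetical helper: split text into lines regardless of which
' line-ending style the file uses.
Function SplitLines(strText)
    Dim strNormalized
    ' Turn CR+LF pairs, then any leftover lone CRs, into a bare LF.
    strNormalized = Replace(strText, vbCrLf, vbLf)
    strNormalized = Replace(strNormalized, vbCr, vbLf)
    ' Every line break is now a single LF, so split on that.
    SplitLines = Split(strNormalized, vbLf)
End Function

arrLines = SplitLines("first" & vbCrLf & "second" & vbLf & "third")
WScript.Echo UBound(arrLines) + 1   ' number of lines found
```

If you know your files always come from Windows tools, the plain Split call with vbNewLine is all you need.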

Now that we have the lines stored in an array, we can use the handy For Each control structure to perform an operation on each line. Just to get warmed up, let’s display the contents of our file with line numbers. I know, that isn’t what FR asked about in his email. But a little experimentation is always a good idea when it comes to learning scripting. The following code should do the trick:

Set objFS = CreateObject("Scripting.FileSystemObject")
Set objFile = objFS.OpenTextFile("C:\logs\logfile.txt")
strFileContents = objFile.ReadAll

arrLines = Split(strFileContents,vbNewLine)
iLineNumber = 1
For Each strLine in arrLines
   WScript.Echo iLineNumber & ": " & strLine
   iLineNumber = iLineNumber + 1
Next

The important thing to understand here is that, on each pass through the loop, the For Each construct automatically stores the next line in the strLine variable. It then runs the statements between the For Each and Next lines, and repeats until all of the lines in arrLines have been processed.

Now, FR actually wanted to look for a MAC address in each of the lines and to output just that MAC address. He didn’t provide the exact format of the files he was scripting against. So, we’ll first assume that they are highly structured, with each line consisting of five values separated by commas, and that the MAC address always appears in the second position. Here’s an example line:

192.168.2.3,00-00-00-00-00-00,Account04,08200701,T

When data is divided – or delimited – like this, we can extract a portion of the data by using the same trick that we used to recover lines of data after reading it all into a single variable. We can use the Split function. This time, the content to be split is stored in the strLine variable and the delimiting character is the comma.

The following line of code chops up a line into its constituent parts, delineated by commas, and stores those parts in an array called arrData.

arrData = Split(strLine, ",")

The content is now neatly stored in the arrData array. If we were to do what we did last time, and use the For Each construct, we would end up outputting all of the data. But FR wants only the MAC addresses.

That means we need to output only the second thing that was stored in the arrData array. Arrays are meant to be accessed in that way. The following code displays just the second item stored in the array:

WScript.Echo arrData(1)

“Wait a minute,” I hear you saying. “We want the second bit of data, and you are using an index of 1, which will, presumably, return the first bit.” This is an oddity of array indexing: it starts at 0. So arrData(0) is the first element of the array and arrData(1) is actually the second element – which is what we want.

Putting everything together, we now have the following:

Set objFS = CreateObject("Scripting.FileSystemObject")
Set objFile = objFS.OpenTextFile("C:\logs\logfile.txt")
strFileContents = objFile.ReadAll
arrLines = Split(strFileContents,vbNewLine)

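For Each strLine in arrLines
   arrData = Split(strLine, ",")
   ' Skip blank or short lines, which have no second item to display
   If UBound(arrData) >= 1 Then
      WScript.Echo arrData(1)
   End If
Next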

This script gets the job done if we have structured data. We may need to adjust the index into the arrData array or change the delimiter character from a comma to something else. But it is fairly straightforward to tweak.
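To make that tweaking a little easier, here is a minimal sketch that puts the delimiter and the index into the arguments of a helper function. GetField is a hypothetical name, and returning an empty string when the line has too few items is simply an assumption about what you would want.

```vbscript
' Hypothetical helper: returns the item at position intIndex
' (counting from 0) of a delimited line, or "" if there is none.
Function GetField(strLine, strDelimiter, intIndex)
    Dim arrParts
    GetField = ""
    arrParts = Split(strLine, strDelimiter)
    ' UBound is the highest valid index; it is -1 for an empty line.
    If intIndex >= 0 And intIndex <= UBound(arrParts) Then
        GetField = arrParts(intIndex)
    End If
End Function

' Displays 00-00-00-00-00-00
WScript.Echo GetField("192.168.2.3,00-00-00-00-00-00,Account04,08200701,T", ",", 1)
```

For a tab-delimited file, you would pass vbTab as the delimiter instead of the comma.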

If FR’s data isn’t so well-structured, then we need to come up with a correspondingly more flexible way to extract the MAC addresses. A great tool for the job is called regular expressions. There’s nothing regular about regular expressions. They often look very odd. But the idea behind them is simple enough. A regular expression is a string that represents a pattern to match.

We want to look at each line and see if it contains a MAC address. So we need to find a regular expression that matches a MAC address. Regular expressions are great because they are powerful and they’ve been so widely used for so long that you can almost always search the Internet and find the pattern you need. Of course, if you’re interested in learning how to create your own regular expressions, there are loads of great references available, most of them easy to find with a simple search. A quick search yielded the following regular expression for matching MAC addresses: ((?:(\d{1,2}|[a-fA-F]{1,2}){2})(?::|-*)){6}.

We can use the same basic scripting framework that we used before. We just need to remove the code that relies on each line being delimited by commas. We’ll replace it with code that checks whether the line includes a match for our MAC address and, if so, outputs the portion of the line that matched.

To use regular expressions in VBScript, you start with the following preparatory statement:

Set objRegExp = new RegExp

You then load the regular expression that represents the pattern you want to match.

objRegExp.Pattern = "((?:(\d{1,2}|[a-fA-F]{1,2}){2})(?::|-*)){6}"

Finally, we test to determine if a line includes a match, indicating that it has a MAC address in it. If this test is positive, then we display the line.

If objRegExp.Test(strLine) Then
   WScript.Echo strLine
End If

This is nearly what we want. But instead of the whole line, we just want to display the bit of the line that matched the regular expression. The following code does just that. The Execute method of the RegExp object returns a collection of Match objects, one for each part of the string that matched the regular expression stored in the Pattern property. (Unless you set the Global property to True, the collection holds only the first match.) We can then use our trusty For Each construct to display those matches – even though we are pretty sure there will be only one match per line.

Set colMatches = objRegExp.Execute(strLine)
For Each strMatch in colMatches
   WScript.Echo strMatch
Next

Putting this all together, we end up with the following script:

Set objFS = CreateObject("Scripting.FileSystemObject")
Set objRegExp = new RegExp
objRegExp.Pattern = "((?:(\d{1,2}|[a-fA-F]{1,2}){2})(?::|-*)){6}"
Set objFile = objFS.OpenTextFile("C:\logs\logfile.txt")
strFileContents = objFile.ReadAll
arrLines = Split(strFileContents,vbNewLine)

For Each strLine in arrLines
   Set colMatches = objRegExp.Execute(strLine)
   For Each strMatch in colMatches
      WScript.Echo strMatch
   Next
Next
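The pattern we found on the Internet is fairly permissive. For instance, it makes the separators optional, so a long enough run of plain digits can also count as a match. If that turns out to be a problem with your data, here is a rough sketch of a stricter variation. It assumes that your MAC addresses are always written as six pairs of hexadecimal digits separated by hyphens or colons. The function name FindMacAddresses and the pattern itself are our own illustration, not something from FR’s email.

```vbscript
' Hypothetical helper: returns the collection of MAC-address-like
' matches found in a string of text.
Function FindMacAddresses(strText)
    Dim objRe
    Set objRe = New RegExp
    ' Five pairs of hex digits, each followed by - or :, then a final pair.
    objRe.Pattern = "\b([0-9A-F]{2}[:-]){5}[0-9A-F]{2}\b"
    objRe.IgnoreCase = True   ' accept a-f as well as A-F
    objRe.Global = True       ' return every match, not just the first
    Set FindMacAddresses = objRe.Execute(strText)
End Function

Set colMatches = FindMacAddresses("192.168.2.3,00-1A-2B-3C-4D-5E,Account04,08200701,T")
For Each objMatch In colMatches
    WScript.Echo objMatch.Value
Next
```

Because Global is set to True, this version also works if you hand it the entire contents of the file at once instead of one line at a time.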

This script is a good starting point for handling any sort of text file manipulation. Experiment with it. Here are a few things to try:

Read in a file and output the first word that appears on each line

Read in two files and output them one after the other

Read in a file and output the lines in reverse order

Read in a file and only output lines that match a regular expression that you found on the Internet

So, there you go, FR. We have you covered whether your data is structured or is as out-of-whack as my back!