Summary: Microsoft Scripting Guy, Ed Wilson, talks about using Windows PowerShell to fix the date in a CSV file prior to data export.

Microsoft Scripting Guy, Ed Wilson, is here. This morning I am sipping a cup of English Breakfast tea. I added a cinnamon stick and a bit of lemon grass to the pot. Of course before I did anything, I primed my pot—that is, I filled the pot half full with hot water and let the pot sit for a couple of minutes. This has two benefits: it warms the pot and it helps ensure that there is no leftover tastes or smells in the pot to ruin the pot of tea.

I am sitting at the bar in our kitchen and using my Surface Pro 3 to check email sent to scripter@microsoft.com. The questions and comments are always a great way to start the day. I also have a fresh cantaloupe that we picked up yesterday at the farmers market. The farmers market is really cool, and is in the center of town.

The Scripting Wife and I like to get up early, walk there, and pick up stuff for our meals for the day. It is nice to meet the person growing the food and to know where the food really comes from. I like the connection. It keeps it real and helps me appreciate the labor involved in farming. There is no Get-Cantaloupe cmdlet. The fruit only comes after hours of backbreaking work. There is not a lot of room for automation around growing them.

I have been trying to make sense of the seriously messed up data that I have to read and bring into another database. I decided the first step will be to write some code to transform three number fields into a single DateTime object. The three number fields are shown here:

Image of spreadsheet

SQL Server loves the .NET System.DateTime object and is able to work well with it. It makes handling things like date manipulation a breeze when it comes to reporting. Therefore, the options are to bring three number fields into my database or to convert the three numbers into a System.DateTime object before I import the data. The latter option is the best way to do things.

The process is pretty easy because of the way Windows PowerShell does type conversions. In fact, I can emulate the process right at the Windows PowerShell console. Let's say I have the following string:

"8/26/14"

I can verify that it is a string by using the GetType method as shown here:

PS C:\> "8/26/14".Gettype() 

IsPublic IsSerial Name                                     BaseType

-------- -------- ----                                     --------

True     True     String                                   System.Object

To turn it into a date, all I need to do is to cast it to a DateTime object. I do this by using the System.DateTime type accelerator. This is shown here:

PS C:\> [datetime]"8/26/14"

Tuesday, August 26, 2014 12:00:00 AM

It is obvious that I now have a DateTime object, but I can again verify it by using GetType:

PS C:\> ([datetime]"8/26/14").GetType()

IsPublic IsSerial Name                                     BaseType

-------- -------- ----                                     --------

True     True     DateTime                                 System.ValueType

My task is to combine the three fields, add my date separators to the three fields for Month, Day, and Year, and then cast it to a DateTime object. Here is the code I use:

[datetime]("{0}/{1}/{2}" -f $d.month, $d.day, $d.year)

As I cycle through the collection of rows, I pick up the Month, Day, and Year fields from each row. I use the format operator, and I substitute the three numbers (Month, Day, and Year) into the position holders {0}, {1}, and {2}. Because this is a format string, I can add in my date separator ( / ).

I decide to output a custom object with one property: Date. I do this by using the [PSCustomObject] type accelerator in Windows PowerShell 4.0. The complete FixDate.Ps1 script is shown here:

$datain = import-csv C:\DataIn\DataDump.csv

Foreach ($d in $datain)

    {

        [PSCustomObject]@{

            Date = [datetime]("{0}/{1}/{2}" -f $d.month, $d.day, $d.year) }  }

When I run the script, it reads the CSV file, cycles through each record, reads the Day, Month, and Year fields, and creates the DateTime objects. This is shown in the following image:

Image of command output

That is all there is to using Windows PowerShell to clean up date information in a CSV file. Data Manipulation Week will continue tomorrow when I will talk about fixing the names.

I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy