Four Easy Ways to Import CSV Files to SQL Server with PowerShell

Summary: Learn four easy ways to use Windows PowerShell to import CSV files into SQL Server.

Microsoft Scripting Guy, Ed Wilson, is here. I was chatting this week with Microsoft PowerShell MVP, Chad Miller, about the series of blog posts I recently wrote about using CSV files. He thought a helpful addition to the posts would be to talk about importing CSV files into SQL Server. I most heartily agreed. Welcome to Guest Blogger Week. We will start off the week with a bang-up article by Chad Miller. Chad has previously written guest blogs for the Hey, Scripting Guy! Blog. Here is a little information about Chad:

Chad Miller is a SQL Server database admin and the senior manager of database administration at Raymond James Financial. In his spare time, he is the project coordinator and developer of the CodePlex project SQL Server PowerShell Extensions (SQLPSX). Chad leads the Tampa Windows PowerShell User Group, and he is a frequent speaker at SQL Saturdays and Code Camps.

Contact information:
Blog: Sev17
Twitter: cmille19

Importing CSV files into SQL Server

Windows PowerShell has built-in support for creating CSV files by using the Export-CSV cmdlet. However, the creation of a CSV file is usually only a short stop in an overall process that includes loading the file into another system. In this post, we'll look at a few script-based approaches to importing CSV data into SQL Server. Note: SQL Server includes a component specifically for data migration called SQL Server Integration Services (SSIS), which is beyond the scope of this article.

T-SQL BULK INSERT command

The T-SQL BULK INSERT command is one of the easiest ways to import CSV files into SQL Server. The BULK INSERT command requires a few arguments to describe the layout of the CSV file and the location of the file. Let's look at an example of creating a CSV file by using Export-CSV, and then importing the information into a SQL Server table by using BULK INSERT.

Requirements

  • The sysadmin role, or insert and bulkadmin permissions, on SQL Server
  • Local access to SQL Server

Setup

1. Download the following script: Invoke-SqlCmd2.ps1

2. Create a diskspace table by running the following code in SQL Server Management Studio.

Note: The example uses a database named "hsg."

CREATE TABLE dbo.diskspace(
UsageDate datetime,
SystemName varchar(50),
Label varchar(50),
VolumeName varchar(50),
Size decimal(6,2),
Free decimal(6,2),
PercentFree decimal(5,2)
)

The following image shows the command in SQL Server Management Studio.

Image of query

3. Save the following script as Get-DiskSpaceUsage.ps1, which will be used as the demonstration script later in this post.

param($ComputerName = ".")

Get-WmiObject -ComputerName $ComputerName -Class Win32_Volume -Filter "DriveType=3" | ForEach-Object {
    New-Object PSObject -Property @{
        UsageDate   = (Get-Date).ToString("yyyy-MM-dd")
        SystemName  = $_.SystemName
        Label       = $_.Label
        VolumeName  = $_.Name
        Size        = [math]::Round(($_.Capacity/1GB), 2)
        Free        = [math]::Round(($_.FreeSpace/1GB), 2)
        PercentFree = [math]::Round((([float]$_.FreeSpace/[float]$_.Capacity) * 100), 2)
    }
} | Select-Object UsageDate, SystemName, Label, VolumeName, Size, Free, PercentFree

Now we will use the Get-DiskSpaceUsage.ps1 script presented earlier. It lists information about disk space, and it stores the information in a CSV file.

./Get-DiskSpaceUsage.ps1 | Export-Csv -Path "C:\Users\Public\diskspace.csv" -NoTypeInformation

The generated CSV file shows that Export-CSV includes a text delimiter of double quotes around each field:

"UsageDate","SystemName","Label","VolumeName","Size","Free","PercentFree"
"2011-11-20","WIN7BOOT","RUNCORE SSD","D:\","59.62","31.56","52.93"
"2011-11-20","WIN7BOOT","DATA","E:\","297.99","34.88","11.7"
"2011-11-20","WIN7BOOT",,"C:\","48","6.32","13.17"
"2011-11-20","WIN7BOOT","HP_TOOLS","F:\","0.1","0.09","96.55"

Although many programs handle CSV files with text delimiters (including SSIS, Excel, and Access), BULK INSERT does not. To use BULK INSERT without a lot of work, we'll need to remove the double quotes. A quick-and-dirty way is simply to replace all the quotes in the CSV file. The blog post Remove Unwanted Quotation Marks from CSV Files by Using PowerShell explains how the Scripting Guys remove double quotes. This method can be used in circumstances where you know it won't cause problems. How do you know? Well, the data generated by our Get-DiskSpaceUsage.ps1 script should never have double quotes or commas in it. So here's the code to remove the double quotes:

(Get-Content C:\Users\Public\diskspace.csv) | foreach {$_ -replace '"'} | Set-Content C:\Users\Public\diskspace.csv

After the quotes are removed, the CSV file looks like this:

UsageDate,SystemName,Label,VolumeName,Size,Free,PercentFree
2011-11-20,WIN7BOOT,RUNCORE SSD,D:\,59.62,31.56,52.93
2011-11-20,WIN7BOOT,DATA,E:\,297.99,34.88,11.7
2011-11-20,WIN7BOOT,,C:\,48,6.32,13.17
2011-11-20,WIN7BOOT,HP_TOOLS,F:\,0.1,0.09,96.55

Now we are ready to import the CSV file as follows:

. .\Invoke-SqlCmd2.ps1

$query = @"
BULK INSERT hsg.dbo.diskspace FROM 'C:\Users\Public\diskspace.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')
"@

Invoke-SqlCmd2 -ServerInstance "$env:computername\sql1" -Database hsg -Query $query

The following data shows that our CSV file was successfully imported.

UsageDate               SystemName  Label        VolumeName  Size    Free   PercentFree
11/20/2011 12:00:00 AM  WIN7BOOT    RUNCORE SSD  D:\         59.62   31.56  52.93
11/20/2011 12:00:00 AM  WIN7BOOT    DATA         E:\         297.99  34.88  11.70
11/20/2011 12:00:00 AM  WIN7BOOT                 C:\         48.00   6.32   13.17
11/20/2011 12:00:00 AM  WIN7BOOT    HP_TOOLS     F:\         0.10    0.09   96.55

BULK INSERT works reasonably well, and it is very simple. However, there are some drawbacks, including:

  • You need elevated permissions on SQL Server.
  • BULK INSERT doesn't easily handle text delimiters.
  • Using a UNC path to the files requires additional setup, as documented under Permissions in the BULK INSERT documentation.
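A side note on the text-delimiter drawback: later versions of SQL Server (2017 and later, which is an assumption beyond the environment used in this post) added a FORMAT = 'CSV' option to BULK INSERT that understands quoted fields, which would make the quote-removal step unnecessary. A sketch:

```sql
-- Sketch only: FORMAT and FIELDQUOTE require SQL Server 2017 or later;
-- on older versions, the quote-removal approach shown earlier is still needed.
BULK INSERT hsg.dbo.diskspace
FROM 'C:\Users\Public\diskspace.csv'
WITH (
    FORMAT = 'CSV',      -- parse quoted CSV fields
    FIELDQUOTE = '"',    -- the text delimiter that Export-CSV emits
    FIRSTROW = 2         -- skip the header row
);
```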

For these reasons, let's look at some alternate approaches.

Before there was Windows PowerShell, there was LogParser

LogParser is a command-line tool and scripting component that was originally released by Microsoft in the IIS 6.0 Resource Kit. LogParser provides query access to different text-based file formats and can output to various data sources, including SQL Server. Even though this little tool hasn't been updated since 2005, it has some nice features for loading CSV files into SQL Server.

Setup

Download and install LogParser 2.2.

LogParser can do a few things that we couldn't easily do by using BULK INSERT, including:

  • Automatically create a table based on the CSV layout
  • Handle the text delimiter of double quotes

Note: CSV files do not need to be local.

Using LogParser

You can use the LogParser command-line tool or a COM-based scripting interface. Let's look at examples of both.

LogParser command-line tool

LogParser requires some special handling when it is called from Windows PowerShell, which is why we use Start-Process: some switches and arguments are difficult to pass when running the tool directly. Also, the Windows PowerShell ISE will not display output from LogParser when it is run via the command-line tool. Here is the syntax for running a command to generate and load a CSV file:

./Get-DiskSpaceUsage.ps1 | Export-Csv -Path "C:\Users\Public\diskspace.csv" -NoTypeInformation -Force

# Uncomment/comment Set-Alias for x86 vs. x64 systems
#Set-Alias logparser "C:\Program Files\Log Parser 2.2\LogParser.exe"
Set-Alias logparser "C:\Program Files (x86)\Log Parser 2.2\LogParser.exe"

Start-Process -NoNewWindow -FilePath logparser -ArgumentList @"
"SELECT * INTO diskspaceLP FROM C:\Users\Public\diskspace.csv" -i:CSV -o:SQL -server:"Win7boot\sql1" -database:hsg -driver:"SQL Server" -createTable:ON
"@

Looking at SQL Server, we see that our newly created table contains the CSV file:

Filename                       RowNumber  UsageDate   SystemName  Label        VolumeName  Size    Free   PercentFree
C:\Users\Public\diskspace.csv  2          2011-11-20  WIN7BOOT    RUNCORE SSD  D:\         59.62   31.56  52.93
C:\Users\Public\diskspace.csv  3          2011-11-20  WIN7BOOT    DATA         E:\         297.99  34.88  11.7
C:\Users\Public\diskspace.csv  4          2011-11-20  WIN7BOOT                 C:\         48      6.32   13.16
C:\Users\Public\diskspace.csv  5          2011-11-20  WIN7BOOT    HP_TOOLS     F:\         0.1     0.09   96.55

The createTable switch creates the table if it does not exist; if it does exist, the rows are simply appended to the existing table. Also notice that we got two new columns: Filename and RowNumber, which could come in handy if we are loading a lot of CSV files. You can eliminate the Filename and RowNumber columns by specifying the column list in the Select statement, as we'll see in a moment.
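The same column-list trick also works with the command-line tool. Here is a sketch (assuming the same logparser alias, paths, and Win7boot\sql1 instance used above) that loads only the CSV columns and keeps Filename and RowNumber out of the destination table:

```powershell
# Sketch: assumes the logparser alias and the Win7boot\sql1 instance set up earlier.
Set-Alias logparser "C:\Program Files (x86)\Log Parser 2.2\LogParser.exe"

# Naming the columns explicitly suppresses the automatic Filename and RowNumber columns.
Start-Process -NoNewWindow -FilePath logparser -ArgumentList @"
"SELECT UsageDate, SystemName, Label, VolumeName, Size, Free, PercentFree INTO diskspaceLP FROM C:\Users\Public\diskspace.csv" -i:CSV -o:SQL -server:"Win7boot\sql1" -database:hsg -driver:"SQL Server" -createTable:ON
"@
```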

LogParser COM scripting

Using the COM-based approach to LogParser is an alternative to the command line. Although the COM-based approach is a little more verbose, you don't have to worry about wrapping the execution in the Start-Process cmdlet, and it also avoids the output problem in the Windows PowerShell ISE. Here is the code to work with the COM object:

$logQuery = New-Object -ComObject "MSUtil.LogQuery"
$inputFormat = New-Object -ComObject "MSUtil.LogQuery.CSVInputFormat"
$outputFormat = New-Object -ComObject "MSUtil.LogQuery.SQLOutputFormat"
$outputFormat.server = "Win7boot\sql1"
$outputFormat.database = "hsg"
$outputFormat.driver = "SQL Server"
$outputFormat.createTable = $true
$query = "SELECT UsageDate, SystemName, Label, VolumeName, Size, Free, PercentFree INTO diskspaceLPCOM FROM C:\Users\Public\diskspace.csv"
$null = $logQuery.ExecuteBatch($query, $inputFormat, $outputFormat)

The main drawback to using LogParser is that it requires, well…installing LogParser. For this reason, let's look at one more approach.

Use Windows PowerShell to collect server data and write to SQL Server

In my previous Hey, Scripting Guy! post, Use PowerShell to Collect Server Data and Write to SQL, I demonstrated some utility functions for loading any Windows PowerShell data into SQL Server. Let's revisit this solution using the CSV file example:

Setup

Download the following scripts: Out-DataTable.ps1, Add-SqlTable.ps1, Write-DataTable.ps1, and Invoke-SqlCmd2.ps1.

Run the following code to create a CSV file, convert to a data table, create a table in SQL Server, and load the data:

. .\Out-DataTable.ps1
. .\Add-SqlTable.ps1
. .\Write-DataTable.ps1
. .\Invoke-SqlCmd2.ps1

$dt = .\Get-DiskSpaceUsage.ps1 | Out-DataTable
Add-SqlTable -ServerInstance "Win7boot\Sql1" -Database "hsg" -TableName diskspaceFunc -DataTable $dt
Write-DataTable -ServerInstance "Win7boot\Sql1" -Database "hsg" -TableName "diskspaceFunc" -Data $dt
Invoke-SqlCmd2 -ServerInstance "Win7boot\Sql1" -Database "hsg" -Query "SELECT * FROM diskspaceFunc" | Out-GridView

The following image shows the resulting table in Grid view.

Image of table

The observant reader will notice that I didn't write the information to a CSV file. Instead, I created an in-memory data table that is stored in my $dt variable. With this approach there was no need to create a CSV file, but for completeness, let's apply the solution to our CSV loading use case:

. .\Out-DataTable.ps1
. .\Add-SqlTable.ps1
. .\Write-DataTable.ps1
. .\Invoke-SqlCmd2.ps1

./Get-DiskSpaceUsage.ps1 | Export-Csv -Path "C:\Users\Public\diskspace.csv" -NoTypeInformation -Force
$dt = Import-Csv -Path "C:\Users\Public\diskspace.csv" | Out-DataTable
Add-SqlTable -ServerInstance "Win7boot\Sql1" -Database "hsg" -TableName diskspaceFunc -DataTable $dt
Write-DataTable -ServerInstance "Win7boot\Sql1" -Database "hsg" -TableName "diskspaceFunc" -Data $dt

This post demonstrated four easy ways to load CSV files into tables in SQL Server by using a scripted approach: the very simple T-SQL BULK INSERT command, the LogParser command-line tool, LogParser COM scripting, and a Windows PowerShell function-based approach.

Thank you, Chad, for sharing this information with us. It looks like your last four scripts have the makings of an awesome NetAdminCSV module.

I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy

 

Comments
  • Hi Chad,

    this is very interesting and quite a bit surprising, that we can feed CSV files that easily to SQL Server!

    The bulk load is my favorite solution, while the four PS scripts seem to be the most interesting way to solve the problem with PowerShell.

    Working with Oracle DBs is quite similar. We have Oracle Loader, SQL*Plus, or external tables that could easily be used to fulfill the same tasks. Building PowerShell scripts might as well lead to a similar solution ...

    Anyway: PowerShell is the glue used to combine "different worlds" in a very convenient way!

    Klaus.

  • I really don't understand the purpose of doing this when we have SSIS. SSIS is very efficient at pulling in CSV files. It basically builds the bulk copy for you and is much simpler. This also allows you to import the SSIS package into MSSQL and run it from the agent.

  • @Charlie

    I think you misunderstand the capabilities of SSIS. It cannot easily load any file from anywhere at any time. It is a server-side process and subject to the restrictions on the server.

    Client-side bulk loading is necessary for maintenance and ad-hoc loads. It runs in the context of the user account and machine.

    For recurring loads where we do have access to the input files we can use SSIS.  

    SSIS is never considered a replacement for bulk loading. Mostly we use BCP, which can do all of the above, but having an integrated PowerShell method is also very useful at times.

  • @Klaus

    It could be easier, I wish BULK INSERT had a text delimiter parameter.

    BTW -- If you haven't checked it out, SQLPSX (http://sqlpsx.codeplex.com) has an OracleBulkCopy function in the OracleClient module.

  • @Charlie

    SSIS is a wonderfully featured tool that's great for ETL, but I'm not sure SSIS is well suited for script-based, ad hoc CSV file loading. You can't create an SSIS package easily from the command line, and if your CSV file changes structure, for example if we run Export-Csv on various PowerShell commands, each time we would need to create a new SSIS package to define the new structure using a GUI tool. SSIS really doesn't like it when we change metadata. In the latter two approaches (LogParser and PowerShell functions), we can automatically create and import any CSV file.


