Learn about Windows PowerShell
Summary: Quit using deprecated JET drivers to talk to Microsoft Access and Excel. Learn how to use Windows PowerShell and ACE Drivers.
Hey, Scripting Guy! How do I get data out of Microsoft Access and Excel files and into SQL Server by using Windows PowerShell?
—MC
Hello MC, Microsoft Scripting Guy, Ed Wilson, is here. We are at the center of SQL Week in honor of SQLRally next week, and our guest blogger today is Chad Miller.
Chad Miller (Blog|Twitter) is a SQL Server DBA and the senior manager of database administration at Raymond James Financial. In his spare time, he is the project coordinator and developer of the CodePlex project SQL Server PowerShell Extensions (SQLPSX). Chad leads the Tampa Powershell User Group and is a frequent speaker at IT Camps, SQL Saturdays, and Code Camps.
Microsoft Access and Excel have been called the desktop database, and companies have a lot of data stored in them. It is inventible that you will need to load some of these files into a SQL Server database.
Note: There are a variety of tools provided by Microsoft to assist in migrating your data into a SQL Server database. In addition Windows Azure, Microsoft Access Services, and Microsoft Excel Services for SharePoint provide alternative solutions, which should be considered as possible strategies. Other solutions may be better suited than the Windows PowerShell scripts demonstrated in this blog post.
Connectivity to a data source is accomplished through ODBC drivers or OLE DB Providers. When we connect to a data source, we first need to determine which connectivity components to use. A Windows operating system includes the ODBC driver and OLE DB Providers for connecting to a variety of data sources, including SQL Server. However, newer Microsoft Office connectivity components are not included with the operating system. Instead, Microsoft Office connectivity components are installed with Microsoft Office or alternatively through something called the Access Control Entry (ACE) driver.
One of the cool things about ACE is first, its name. But more importantly, ACE is completely free, and it even includes a 64-bit version.
Important: ensure that you have the ACE driver/provider installed on your machine or else the Windows PowerShell scripts demonstrated in this post will not work! Before you proceed, go to Microsoft Access Database Engine 2010 Redistributable, and download AccessDatabaseEngine.exe or AccessDatabaseEngine_x64.exe, depending on your operating system. The ACE drivers are supported by Windows 7; Windows Server 2003 R2, 32-bit x86; Windows Server 2003 R2, x64 editions; Windows Server 2008 R2; Windows Server 2008 with Service Pack 2; Windows Vista with Service Pack 1; and Windows XP with Service Pack 3.
When you you have ACE drivers, there is no reason to use the old deprecated JET drivers—even for older versions of Microsoft Access and Excel. A common mistake I see, even with seasoned developers, is to drop to JET for .mdb and .xls files when you don’t need to. I have made this mistake myself past.
I found a helpful blog post on MSDN from the CSS SQL Server Engineers that talks about different data providers and discusses a migration strategy.
Having addressed the issue of connectivity, we can now look at a simple Windows PowerShell module that provides a function over the ACE driver, which I call ACE.
To install the ACE module, perform the following steps.
To use the module, you need to import the module into your current Windows PowerShell session. This is accomplished by running the following command.
Import-module ace
The ace module consists of one exported function called Get-ACEData. To see Help including examples of usage, run the following command.
Help Get-ACEData –full
Now that the ACE module is installed and ready to use, let us look at couple of examples.
I am going to use an Excel file called backup.xlsx as shown here.
First, let us list the worksheets that are part of the backup.xlsx spreadsheet.
PS C:\Users\u00\bin> Get-ACEData -FilePath .\backupset.xlsx -TableListOnly TABLE_CATALOG : TABLE_SCHEMA : TABLE_NAME : Sheet1$ TABLE_TYPE : TABLE TABLE_GUID : DESCRIPTION : TABLE_PROPID : DATE_CREATED : 4/19/2011 9:41:11 PM DATE_MODIFIED : 4/19/2011 9:41:11 PM TABLE_CATALOG : TABLE_SCHEMA : TABLE_NAME : Sheet2$ TABLE_TYPE : TABLE TABLE_GUID : DESCRIPTION : TABLE_PROPID : DATE_CREATED : 4/19/2011 9:41:11 PM DATE_MODIFIED : 4/19/2011 9:41:11 PM TABLE_CATALOG : TABLE_SCHEMA : TABLE_NAME : Sheet3$ …
PS C:\Users\u00\bin> Get-ACEData -FilePath .\backupset.xlsx -TableListOnly
TABLE_CATALOG :
TABLE_SCHEMA :
TABLE_NAME : Sheet1$
TABLE_TYPE : TABLE
TABLE_GUID :
DESCRIPTION :
TABLE_PROPID :
DATE_CREATED : 4/19/2011 9:41:11 PM
DATE_MODIFIED : 4/19/2011 9:41:11 PM
TABLE_NAME : Sheet2$
TABLE_NAME : Sheet3$
…
To shorten the output, I will pipe the results to Select-Object and specify the TABLE_NAME property only.
PS C:\Users\u00\bin> Get-ACEData -FilePath .\backupset.xlsx -TableListOnly | Select-Object TABLE_NAME TABLE_NAME ---------- Sheet1$ Sheet2$ Sheet3$
PS C:\Users\u00\bin> Get-ACEData -FilePath .\backupset.xlsx -TableListOnly | Select-Object TABLE_NAME
TABLE_NAME
----------
Sheet1$
Sheet2$
We can see that the three worksheets that make up the spreadsheet are returned. Next, let us return the data. To return all data from every worksheet, run the following command.
$ds= Get-ACEData -FilePath .\backupset.xlsx
If we look at the $ds variable through Get-Member, we see that the function returns the DataSet shown here.
This DataSet is made up of a collection of DataTables. To see how many DataTables, run the following command.
$ds.Tables.count 3
$ds.Tables.count
3
To see the DataTable names, run the following command.
$ds.Tables | Select TableName TableName --------- Sheet1$ Sheet2$ Sheet3$
$ds.Tables | Select TableName
TableName
---------
Sheet3$
If we want look at the data, we would run:
$ds.Tables
To see only the first DataTable, specify the collection index or the name of the DataTable, for example:
$ds.Tables[0] or $ds.Tables[“Sheet1$”]
If you want to return only the first and second worksheets, we can specify either the Table or the Query parameter as shown here.
$ds= Get-ACEData -FilePath .\backupset.xlsx -Table 'Sheet1$','Sheet2$'
To restrict the rows that are returned from a worksheet, use the Query parameter as shown here.
$ds = Get-ACEData -FilePath .\backupset.xlsx -Query "Select * FROM [Sheet1$] WHERE [dbname] = 'db1'"
The ACE driver supports .xls (Excel 2003) and . xlsx (Excel 2007 and higher) files.
Next, we will query a Microsoft Access file.
I am going to use the sample Northwind Microsoft Access database, which is available from File>>New>>Sample Templates as shown in the following image.
Just as I did with the Excel file, we will get a list of tables. However, Microsoft Access has different table types, including views and system tables, which we’ll need to filter out and load into a variable as shown here.
$tables = Get-ACEData -FilePath .\Northwind.accdb -TableListOnly | where {$_.TABLE_TYPE -eq 'TABLE' } | Select -ExpandProperty TABLE_NAME $tables Customers Employee Privileges Employees Inventory Transaction Types Inventory Transactions Invoices Order Details Order Details Status Orders Orders Status Orders Tax Status Privileges Products Purchase Order Details Purchase Order Status Purchase Orders Sales Reports Shippers Strings Suppliers
$tables = Get-ACEData -FilePath .\Northwind.accdb -TableListOnly | where {$_.TABLE_TYPE -eq 'TABLE' } | Select -ExpandProperty TABLE_NAME
$tables
Customers
Employee Privileges
Employees
Inventory Transaction Types
Inventory Transactions
Invoices
Order Details
Order Details Status
Orders
Orders Status
Orders Tax Status
Privileges
Products
Purchase Order Details
Purchase Order Status
Purchase Orders
Sales Reports
Shippers
Strings
Suppliers
To load all of the regular tables contained in the Northwind Microsoft Access file, run the following command.
$ds = Get-ACEData -FilePath .\Northwind.accdb -Table $tables
To see the number of tables in the DataSet, run:
$ds.Tables.Count 20
$ds.Tables.Count
Like in the Excel example, we can specify queries by using the Query parameter or specific tables with the Table parameter. The ACE Driver supports .mdb (Access 2003) and .accdb (Access 2007 or higher) formats.
At this point, we have retrieved data from Excel and Access files. Next we will look at loading this data into a SQL Server database. I am going to use the free, SQL Server Express version. If you do not have a SQL Server instance available, you can download and install SQL Server Express. I am also going create a database called hsgb in SQL Server Management Studio as shown in the following image.
In a previous Hey! Scripting Guy blog titled Use PowerShell to Collect Server Data and Write to SQL, I demonstrated three functions that I will use to load and query the data in SQL Server. Download each script from the following locations:
Next, source each function in Windows PowerShell as follows (note that this is dot space dot forward slash).
. ./Invoke-SqlCmd2 . ./Write-DataTable . ./Add-SqlTable
. ./Invoke-SqlCmd2
. ./Write-DataTable
. ./Add-SqlTable
We are ready to import our Excel and Access data.
To import our Excel data, run the following command.
$ds= Get-ACEData -FilePath .\backupset.xlsx -Table 'Sheet1$' Add-SqlTable -ServerInstance "Z003\SQLEXPRESS" -Database hsgb -TableName $($ds.Tables[0].TableName) -DataTable $ds.Tables[0] Write-DataTable -ServerInstance "Z003\SQLEXPRESS" -Database hsgb -TableName $($ds.Tables[0].TableName) -Data $ds.Tables[0]
$ds= Get-ACEData -FilePath .\backupset.xlsx -Table 'Sheet1$'
Add-SqlTable -ServerInstance "Z003\SQLEXPRESS" -Database hsgb -TableName $($ds.Tables[0].TableName) -DataTable $ds.Tables[0]
Write-DataTable -ServerInstance "Z003\SQLEXPRESS" -Database hsgb -TableName $($ds.Tables[0].TableName) -Data $ds.Tables[0]
By running Invoke-SqlCmd2, we can see our newly inserted data as shown here.
Invoke-Sqlcmd2 -ServerInstance "Z003\SQLEXPRESS" -Database hsgb -Query "select * from [sheet1$]" | Out-GridView
To import a group of Microsoft Access tables, run the following command.
$ds = Get-ACEData -FilePath .\Northwind.accdb -Table Customers,Suppliers $ds.Tables | foreach {Add-SqlTable -ServerInstance "Z003\SQLEXPRESS" -Database hsgb -TableName $($_.TableName) -DataTable $_ } $ds.Tables | foreach {Write-DataTable -ServerInstance "Z003\SQLEXPRESS" -Database hsgb -TableName $($_.TableName) -Data $_ }
$ds = Get-ACEData -FilePath .\Northwind.accdb -Table Customers,Suppliers
$ds.Tables | foreach {Add-SqlTable -ServerInstance "Z003\SQLEXPRESS" -Database hsgb -TableName $($_.TableName) -DataTable $_ }
$ds.Tables | foreach {Write-DataTable -ServerInstance "Z003\SQLEXPRESS" -Database hsgb -TableName $($_.TableName) -Data $_ }
By running Invoke-SqlCmd2, we can view the data in our new SQL Server tables as shown in the following examples.
Invoke-Sqlcmd2 -ServerInstance "Z003\SQLEXPRESS" -Database hsgb -Query "select * from customers" | Out-GridView
Invoke-Sqlcmd2 -ServerInstance "Z003\SQLEXPRESS" -Database hsgb -Query "select * from suppliers" | Out-GridView
This blog demonstrated how to query and load the Microsoft Excel and Access files into a SQL Server table. The functions Get-ACEData, Invoke-SqlCmd2, Write-DataTable, and Add-SqlTable can be used to easily automate your Excel and Access data loads.
MC, that is all there is to using Windows PowerShell and ACE Drivers to get data out of Microsoft Access and Excel files and into SQL Server. SQL Week will continue tomorrow when guest blogger Michael Wells will talk about how to automatically create tempdb files in SQL Server.
Thank you, Chad, for an awesome blog post.
I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.
Ed Wilson, Microsoft Scripting Guy
Hello Chad,
GREAT! And just the stuff i would definitely use each day from my office!
I'm SO sorry, that I can't test anything @work because we have STILL !!! Office 2000 !!!
and Oracle (mostly).
My co-workers still use a lot of Excel data. And some Access DBs, too.
Strange enough most of the data lives in some Oracle DBs but of course never in the same
kind of representation that they carry around, update, read, write and import/export on their own.
So there is aways reason enough to be able to get the back into a DB, update other data sources and
transfer them as newly exported excel sheets to other users. I get my daily Excel spreadsheet via email
and not seldom, I have to get some data back into our DBs. The Powershell approach would be of great help
in many occasions, if you don't want to end up in a "copy&paste editor orgy" to send a bunch of nearly
handcrafted SQL commands to a DB server :-(((
I hope, I can use it somehow even with the old office version and oracle ...
kind regards,
Klaus (Schulte)
Klaus,
I haven't tested the ACE drivers with Office 2000, only 2003, 2007 and 2010. I'm not sure if they work against 2000 or not. If you do test with an Office 2000 XLS file, let us know your results.
For loading data into Oracle, one possible solution is to convert the XLS file to CSV and then use Oracle's SQL Loader utility (or external tables). Here's an example of converting the XLS file to CSV using PowerShell:
import-module ace
$ds= Get-ACEData -FilePath .\backupset.xls -Table 'Sheet1$'
$ds.Tables[0] | Export-Csv ./backupset.csv -NoTypeInformation
any Set-ACEData command available with ACE?
@AP
Set is slightly more difficult. To update existing Excel spreadsheets and MS Access files, I treat them like databases and use ADO.NET. I did presentation about this for the South Florida IT Camp 2011. The slides and scripts demonstrate updating Access and Excel in PowerShell using the ACE driver and ADO.NET directly. You can download presentation/scripts here: sev17.com/.../south-florida-it-camp-presentation
I'm using the module to pull out a dataset from an XLS and finding that some cells from the source are inexplicably missing from the dataset.
$ds= Get-ACEData -FilePath $FilePath
([xml]$ds.getxml()).save("D:\Users\Me\Desktop\output.xml")
In this case, column 13, 'Business Zip' is missing:
...<F11>Business City</F11> <F12>Business State</F12> <F14>Business Country</F14>...
Are there any tricks to debug this to figure out what's going on?
Thanks