Get Excel data without Excel

Get Excel data without Excel

  • Comments 1
  • Likes

In a specific scenario, I needed to be able to query an Excel file and get data from a worksheet but I didn’t (and couldn’t) have the Office suite (or even Excel only) installed on the machine.

If the Excel file is and xlsx file (the new Microsoft Open XML Format since Excel 2007), you could extract the xml content (e.g. xl\worksheets\sheet1.xml) from the archive (the xlsx file is actually a compressed archive), read it as an xmlDocument, and so on... But it is not really that straight forward… I’ll leave it for a future post.

What I ended up doing, is using the ACE.OLEDB provider to query the Excel worksheet as if it were a SQL table. The only problem is, that you can call it only from 32bit applications. Don’t worry, you don’t have to explicitly run the script from a 32bit PowerShell console or ISE. I wrapped the code that queries the Excel as a scriptblock and I invoke it as a 32bit job using the -RunAs32 switch (see Get-Help about_Jobs or Get-Help Start-Job for more details)

The complete PowerShell example function is:

function Get-ExcelData {
    [CmdletBinding(DefaultParameterSetName='Worksheet')]
    Param(
        [Parameter(Mandatory=$true, Position=0)]
        [String] $Path,

        [Parameter(Position=1, ParameterSetName='Worksheet')]
        [String] $WorksheetName = 'Sheet1',

        [Parameter(Position=1, ParameterSetName='Query')]
        [String] $Query = 'SELECT * FROM [Sheet1$]'
    )

    switch ($pscmdlet.ParameterSetName) {
        'Worksheet' {
            $Query = 'SELECT * FROM [{0}$]' -f $WorksheetName
            break
        }
        'Query' {
            # Make sure the query is in the correct syntax (e.g. 'SELECT * FROM [SheetName$]')
            $Pattern = '.*from\b\s*(?<Table>\w+).*'
            if($Query -match $Pattern) {
                $Query = $Query -replace $Matches.Table, ('[{0}$]' -f $Matches.Table)
            }
        }
    }

    # Create the scriptblock to run in a job
    $JobCode = {
        Param($Path, $Query)

        # Check if the file is XLS or XLSX 
        if ((Get-Item -Path $Path).Extension -eq 'xls') {
            $Provider = 'Microsoft.Jet.OLEDB.4.0'
            $ExtendedProperties = 'Excel 8.0;HDR=YES;IMEX=1'
        } else {
            $Provider = 'Microsoft.ACE.OLEDB.12.0'
            $ExtendedProperties = 'Excel 12.0;HDR=YES'
        }
        
        # Build the connection string and connection object
        $ConnectionString = 'Provider={0};Data Source={1};Extended Properties="{2}"' -f $Provider, $Path, $ExtendedProperties
        $Connection = New-Object System.Data.OleDb.OleDbConnection $ConnectionString

        try {
            # Open the connection to the file, and fill the datatable
            $Connection.Open()
            $Adapter = New-Object -TypeName System.Data.OleDb.OleDbDataAdapter $Query, $Connection
            $DataTable = New-Object System.Data.DataTable
            $Adapter.Fill($DataTable) | Out-Null
        }
        catch {
            # something went wrong :-(
            Write-Error $_.Exception.Message
        }
        finally {
            # Close the connection
            if ($Connection.State -eq 'Open') {
                $Connection.Close()
            }
        }

        # Return the results as an array
        return ,$DataTable
    }

    # Run the code in a 32bit job, since the provider is 32bit only
    $job = Start-Job $JobCode -RunAs32 -ArgumentList $Path, $Query
    $job | Wait-Job | Receive-Job
    Remove-Job $job
}

Then, you can use it to get the entire default worksheet (Sheet1):

Get-ExcelData -Path C:\myFiles\Users.xlsx

Or to get a specific worksheet:

Get-ExcelData -Path C:\myFiles\Users.xlsx -WorksheetName 'Sheet2'

Or by specifying a query:

Get-ExcelData -Path C:\myFiles\Users.xlsx -Query 'SELECT TOP 3 * FROM Sheet3'

Or an even more complex query:

Get-ExcelData -Path C:\myFiles\Users.xlsx -Query "SELECT GivenName, Surname, City, State FROM Sheet1 WHERE State in ('CA','WA')"

 

HTH,

/Martin.

Comments
Your comment has been posted.   Close
Thank you, your comment requires moderation so it may take a while to appear.   Close
Leave a Comment