Copy CSV Columns to an Excel Spreadsheet by Using PowerShell

Copy CSV Columns to an Excel Spreadsheet by Using PowerShell

  • Comments 11
  • Likes


Summary: Join the Microsoft Scripting Guys as they show you how to copy CSV columns to a Microsoft Excel spreadsheet by using Windows PowerShell.

 

Hey, Scripting Guy! Question

Hey, Scripting Guy! I love comma-separated value (CSV) files. To me, it seems as if that was the best way to store data. It is almost universally accepted and is still widely used. The files are small and simple to manipulate. I have enjoyed using Windows PowerShell to do text manipulation. And working with a CSV file is, at its core, simple text manipulation.

One reason I like CSV files is that I can easily work with them on server products. And I do not have to eat up several gigabytes worth of space to install Microsoft Office, which is not supported on a server operating system anyway. However, for all its grace and elegance, I will admit that reading a CSV file with the human eye is not the easiest process. For that I prefer to use Microsoft Excel. Though it is true that I can open a CSV file in Microsoft Excel, it is rather annoying when it comes to saving the file. There must be a dozen warnings and prompts I have to navigate. When I have to work on a large number of CSV files, this process becomes loathsome. I would love to be able to copy columns from a CSV file and save the data into an honest to goodness Microsoft Excel spreadsheet and avoid all the prompts. Can this be done?

-- JM

 

Hey, Scripting Guy! Answer Hello JM,

Microsoft Scripting Guy Ed Wilson here. Well, for the first time in more than four months, I have the windows open and the air conditioner turned off. The leaves have not started changing colors yet, and the temperature at night is only mid 60s Fahrenheit (18 degrees Celsius), but it is a giant step in the right direction. Soon, American football will begin, and there will be another sport I can ignore while I sit quietly beside the fireplace and read Shakespeare.

JM, email that is sent to scripter@microsoft.com is not something I can ignore. I wrote the Import-ProcessDataToExcel.ps1 script to illustrate importing specific columns of data into a newly created Microsoft Excel spreadsheet. The complete Import-ProcessDataToExcel.ps1 script is shown here.

Import-ProcessDataToExcel.ps1

Param(
 
$csvFile = "C:\fso\process.csv",
 
$path = "C:\fso\process.xlsx"
)
$processes = Import-Csv -Path $csvFile
$Excel = New-Object -ComObject excel.application
$Excel.visible = $false
$workbook = $Excel.workbooks.add()
$excel.cells.item(1,1) = "name"
$excel.cells.item(1,2) = "Virtual Memory"
$excel.cells.item(1,3) = "Working Set"
$excel.cells.item(1,4) = "Private Memory"
$excel.cells.item(1,5) = "Non-paged memory"
$i = 2
foreach($process in $processes)
{
 
$excel.cells.item($i,1) = $process.name
 
$excel.cells.item($i,2) = $process.vm
 
$excel.cells.item($i,3) = $process.ws
 
$excel.cells.item($i,4) = $process.pm
 
$excel.cells.item($i,5) = $process.npm
 
$i++
} #end foreach process
$workbook.saveas($path)
$Excel.Quit()
Remove-Variable -Name excel
[gc]::collect()
[gc]::WaitForPendingFinalizers()

To create an interesting CSV file to work with for today’s Hey, Scripting Guy! Blog post, I used Windows PowerShell and I exported process information. I used the Get-Process cmdlet to retrieve information about all the processes running on my computer. I then piped the object to the Export-Csv cmdlet. If you are not planning on reconstituting the object, but instead you want to create a traditional CSV file, it is important to include the notypeinformation switch. Without using this switch, type information is written to the first line of the CSV file, and it will confuse things. An example of type information is shown here:

#TYPE System.Diagnostics.Process

Here is the command I used:

Get-Process | Export-Csv -Path C:\fso\process.csv -NoTypeInformation

The resultant CSV file is shown in the following image. JM, I think you were being kind when you said reading the file is not the easiest. To me it looks like a Rorschach test. I think I see a Scripting Guy eating an ANZAC biscuit while drinking a cup of Earl Grey tea. I wonder what that means? Oh, yeah, it must be time for my afternoon snack.

Image of CSV file

The Import-ProcessDataToExcel.ps1 script begins by creating two input parameters. The first parameter is the path to the saved CSV file, and the second parameter is the path for the newly created Microsoft Excel spreadsheet. This is shown here:

Param(

 
$csvFile = "C:\fso\process.csv",

 
$path = "C:\fso\process.xlsx"

)

 

One thing to keep in mind about this script—I do not do a check for an existing spreadsheet with the same name in the same location. To do this, you could use the Test-Path cmdlet to see if the file exists, and if it does not exist, use the saveas method. Otherwise, you might delete the file and then call saveas. Alternatively, you might want to update the file by using the save method. As you can tell, there are a number of decisions you might need to make depending on the behavior you want. If the file does exist and the saveas method is called, you will be prompted to overwrite the file.

The Import-ProcessDataToExcel.ps1 script uses the Import-Csv cmdlet to import the data from the CSV file. The imported data is stored in the $processes variable. This is shown here:

$processes = Import-Csv -Path $csvFile

Nearly all of the time when you are working with Microsoft Excel, you will need to create an instance of the excel.application object. There are exceptions, such as Hey, Scripting Guy! How Can I Read from Excel Without Using Excel? article or Hey, Scripting Guy! How Can I Write to Excel Without Using Excel?, but those are rare cases. After the application object has been created, set the visible property to false, and add a new workbook to the workbooks collection. This is shown here:

$Excel = New-Object -ComObject excel.application

$Excel.visible = $false

$workbook = $Excel.workbooks.add()

 

I then decide to add column headings to the first row. To do this, I use the cells collection from the application object. This is a bit of an unusual move because usually the cells collection is accessed from the worksheet object. The cells collection from the application object returns a range object that represents all the cells in the active worksheet. The active worksheet in our example is sheet1 in the newly added workbook. By accessing the cells collection from the application object, we avoid having to create a worksheet object. There is a risk in this technique in that if the active document is not a worksheet, the cells property call will fail, which would cause the entire script to not work. But in this example, there is little danger because as the script is written, the active document must be a worksheet. The first number used by the item method represents the row, and the second number represents the column. Here are the column heads:

$excel.cells.item(1,1) = "name"

$excel.cells.item(1,2) = "Virtual Memory"

$excel.cells.item(1,3) = "Working Set"

$excel.cells.item(1,4) = "Private Memory"

$excel.cells.item(1,5) = "Non-paged memory"

 

We are fished with the first row; therefore, the value of the $i variable (that will be used later to keep track of our row number) is set to 2. The foreach statement is used to walk through the collection of processes that are stored in the $processes variable. This portion of the script is shown here:

$i = 2

foreach($process in $processes)

{

 

The nice thing about working with CSV data that has been imported via the Import-Csv cmdlet is that it allows you to address the data by column name. This means the script is easier to read. To add data to the Microsoft Excel spreadsheet, use the item method from the cells collection, specify the location on the spreadsheet, and assign the data. By using the $i variable to keep track of the current row and assigning the other static columns, the following code is achieved:

$excel.cells.item($i,1) = $process.name

$excel.cells.item($i,2) = $process.vm

$excel.cells.item($i,3) = $process.ws

$excel.cells.item($i,4) = $process.pm

$excel.cells.item($i,5) = $process.npm

 

The ++ operator is used to increment the value of the $i variable by 1. The first time through the loop, the value of $i is equal to 2 because it was set prior to entering the foreach loop. The next time through, the value will be 3 and so on. The ++ operator works the same as saying $i = $i + 1. This is shown here:

PS C:\> $i = 2

PS C:\> $i++

PS C:\> $i

3

PS C:\> $a = 2

PS C:\> $a = $a + 1

PS C:\> $a

3

PS C:\>

 

The saveas method from the workbook object automatically creates the file if it does not already exist. If the file does exist, it prompts to overwrite the file. After the file is saved, the quit method from the application object is used to exit Microsoft Excel. The $excel variable is then deleted, and the garbage collection is forced by calling the collect method from the GC class. The WaitForPendingFinalizers Method causes the script to wait for all finalizers to complete before continuing.

The completed Microsoft Excel workbook is shown in the following image.

Image of completed Excel workbook

JM, that is all there is to using Windows PowerShell to create a Microsoft Excel workbook from previously existing CSV data. This also concludes Microsoft Excel Week. Join us tomorrow for Quick-Hits Friday.

We invite you to follow us on Twitter and Facebook. If you have any questions, send email to us at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

 

Ed Wilson and Craig Liebendorfer, Scripting Guys

Your comment has been posted.   Close
Thank you, your comment requires moderation so it may take a while to appear.   Close
Leave a Comment
  • Do you think there's a method to do something similar but for an Access/SQL table instead?

  • Just wondered  what would be the best way to get rid of the Excel COM object. The Windows PowerShell Tip of the Week  "Getting Rid of a COM Object (Once and For All)" at technet.microsoft.com/.../ff730962.aspx recommends: [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Excel)

  • I tried a variation of this, and on the

    $workbook = excel.workbooks.add()

    line I got an error about an expression expected after the left parenthesis.  

    Where is the documentation on these various methods for excel?

    Thank you.

  • Pardon my premature question.  The script error was my own fault (no dollar sign before excel.workbooks.add) but the question seems reasonable, namely, is there documentation on all these methods.  I do see the examples in the script repository, but that is not the same as a comprehensive list of all the things one can do with/to an excel spreadsheet.

    Thank you.

  • This post would have been considerably more useful if it was generalized. Everything is hard-coded. Can we not get and print the column names dynamically; deal with an arbitrary number of columns; pass the paths to the script, etc?

  • @Matt B. you can use ADO to write to Access, or to SQL. See the ADO tag on this blog, or the Scripting for databases on the Learn to script page.

    @WF probably a good thing to do.

    @Steven White MSDN is the comprenhensive source of information about everything you can do with Microsoft Excel. See the hyperlinks in the article for guidance.

    @WhoEver abstraction is always a good thing, but generally makes the script more complex to understand. Feel free to modify the script and post it to the Repository.

  • This works for me, but can end up being pretty long.  On my server, it takes approx 0.04 seconds per imported cell.  Do the math, but with a 10,000 rows spreadsheet with 86 columns, it takes over 9 hours to complete!  Any suggestions to make it faster? Maybe something else than a "foreach" loop... ?

  • What is not made clear here is that there is no need to import a CSV into an Excel spreadsheet.  a CSV file will open directly in Excel and can be easily saved in many formats.  

    This is very fast.

    The only reason to do an import is if you want to edit the values or filter the import.  This too may be easier after just opening the CSV in Excel directly.

  • @jrv:  the CSV data need to be pushed into a XLSX file where many graphs and pivot tables depend on this data.  Note that this process will be automated (unsupervised) and will either do daily, weekly, or monthly data concatenation/handling.  I've found another method here: gallery.technet.microsoft.com/.../d41565f1-37ef-43cb-9462-a08cd5a610e2 but my powershell skills are way to low to make it work... when i pipe the csv file it only loads the filename lenght or each line byte count... i'm desperate now... help!

  • @JB

    You need to use Excel directly.  It is not necessary to import.  YOur master sheet can read data directly from a CSV file.

    There are a number of places where you can get more help on this.

    You can post in an Excel forum at Microsoft:social.technet.microsoft.com/.../threads

    You can post at The Scripting Guys Forum at Microsoft: social.technet.microsoft.com/.../ITC

    You can post at: www.scriptinganswers.com

    Each forum has its own particular strengths and weaknesses.

  • @jrv:   I solved my problem!

    Here's the code, where I copy the content of "input.csv" in the "specific_tab" of the "output.xlsx" file.  In this example, there is only 2 columns in input.csv (if more are present, change the "A1:B1" portion of this code).  Note that excel turns the csv filename into the tab name automatically, make sure that this line "$Worksheet = $Workbook.WorkSheets.item("input")" is adjusted.

    $sourcefile = "C:\input.csv"

    $updatedfile = "C:\output.xlsx"

    $Excel = New-Object -ComObject excel.application

    $Excel.visible = $false

    $Workbook = $excel.Workbooks.open($sourcefile)

    $Worksheet = $Workbook.WorkSheets.item("input")

    $worksheet.activate()  

    $range = $WorkSheet.Range("A1:B1").EntireColumn

    $range.Copy() | out-null

    $Workbook = $excel.Workbooks.open($updatedfile)

    $Worksheet = $Workbook.WorkSheets.item("specific_tab")

    $worksheet.activate()  

    $Range = $Worksheet.Range("A1")

    $Worksheet.Paste($range)  

    $workbook.Save()  

    $Excel.Quit()

    Remove-Variable -Name excel

    [gc]::collect()

    [gc]::WaitForPendingFinalizers()