Easily Remove Columns from a CSV File by Using PowerShell

Easily Remove Columns from a CSV File by Using PowerShell

  • Comments 11
  • Likes

Summary: Use a super simple, one-line command to remove columns easily from a CSV file using Windows PowerShell.

 

Hey, Scripting Guy! QuestionHey, Scripting Guy! I have a CSV file from which I need only two of eight columns. Can Windows PowerShell delete certain columns? I have searched the Internet for days with no luck.

—RK

 

Hey, Scripting Guy! AnswerHello RK,

Microsoft Scripting Guy Ed Wilson here. The Scripting Wife and I are absolutely enjoying Canada. The classes I have been teaching are populated with extremely bright students who have asked engaging questions. I love teaching because I get lots of good questions that force me to spend time learning new things about Windows PowerShell. Great questions are also one of the reasons I like reading the email sent to scripter@microsoft.com.

RK, your question jumped out at me today. The reason is that at first I thought it would be a pain to answer because I would need to parse through the file, and I hate parsing text. But then, I thought about it some more. A comma-separated value (CSV) file is structure text—it is not plain text. In addition, Windows PowerShell has several great cmdlets for working with CSV files. I then wondered if, after I modify the stream, I can pipe it back out to another CSV file?

I have a CSV file with process IDs, process names, and CPU time. I can use the Import-CSV cmdlet to read a CSV file and to display the contents on the Windows PowerShell console. This is shown in the following figure.

Image of reading CSV file with Import-CSV cmdlet

The text file itself contains the same information. The file is shown in the following figure.

Image of text file with same information as in previous image

It is obvious the problem with the data stored in this CSV file is that it is missing the CPU column for about half of the processes. This can be a distraction, so I want to create a new CSV file that contains the process name and the process ID information. To do this manually would be a major pain. To write a VBScript to do this would be an even bigger pain. But doing this with Windows PowerShell is surprisingly easy. The secret does not lie in either the Import-CSV cmdlet that creates an object from the CSV file, or in the Export-CSV cmdlet that writes the information to a CSV file. The secret is the Select-Object cmdlet. The cool thing about the Select-Object cmdlet is that it creates a custom object from the piped input.

Custom objects are cool. In fact, the Import-CSV cmdlet itself creates a custom object if no type information exists in the file. The following output illustrates this fact:

PS C:\> Import-Csv C:\fso\co.csv | Get-Member

 

 

   TypeName: System.Management.Automation.PSCustomObject

 

Name                                       MemberType                             Definition

Equals                                       Method                                     bool Equals(System.Object obj)

GetHashCode                            Method                                     int GetHashCode()

GetType                                    Method                                     type GetType()

ToString                                    Method                                     string ToString()

CPU                                          NoteProperty                            System.String CPU=

Id                                             NoteProperty                            System.String Id=2008

ProcessName                             NoteProperty                            System.String ProcessName=AEADISRV

 

When I use the Select-Object cmdlet to choose the processname and the id properties, I still have a custom object, but this time it only contains two properties, as shown in this output:

PS C:\> Import-Csv C:\fso\co.csv | select processname,id | get-member

 

 

   TypeName: Selected.System.Management.Automation.PSCustomObject

 

Name                           MemberType                                         Definition

Equals                           Method                                                 bool Equals(System.Object obj)

GetHashCode                Method                                                 int GetHashCode()

GetType                        Method                                                 type GetType()

ToString                        Method                                                 string ToString()

Id                                 NoteProperty                                        System.String Id=2008

ProcessName                 NoteProperty                                        System.String ProcessName=AEADISRV

 

Because the Export-CSV cmdlet accepts piped input, all I need to do is pipe the custom object from the Select-Object cmdlet (select is an alias) to the Export-CSV cmdlet. The command is shown here:

Import-Csv C:\fso\co.csv | select processname,id | Export-Csv -Path c:\fso\modco.csv –NoTypeInformation

When I open the newly created CSV file in Notepad, I can see that the command worked. I no longer have any CPU time information recorded in the file. The file is shown in the following figure.

Image of proof that command worked

 

RK, that is all there is to using Windows PowerShell to remove columns from a CSV file. The secret lies in the way that Windows PowerShell pipes objects. And when working with objects, the easiest way to create a custom object is to use the Select-Object cmdlet. In fact, if you have been following the 2011 wrap-up articles, you will see that I was recommending using Select-Object to return an object from a function instead of using Format-Table or some other format type of cmdlet.

Join me tomorrow for more cool Windows PowerShell stuff. See you here!

I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy

 

Your comment has been posted.   Close
Thank you, your comment requires moderation so it may take a while to appear.   Close
Leave a Comment
  • Hey Ed,

    a wonderful deal to delete CSV columns with powershell....

    I will think of it, ... maybe it will change the way i did it before:

    opening Excel,

    clicking the column headers

    and deleting the columns is not that much fun :-)))

    Klaus

  • @Klaus Shulte I was really supprised at how easy it was to delete columns in a CSV file. When I read the email from the reader, I thought, OH NO, that is going to be painful, all that text parsing ... then I realized, wait, it is an object ... it is easy.

  • Thanks, Ed!  This is one of those that has eluded me for too long for no good reason!

  • @David Caudill I am glad you like it and find it useful. It is a cool technique. Thanks for letting me know you liked it.

  • How would you remove an empty column while importing a csv file. I have an empty column in the csv file and when I try to import the csv file I get an error. Please help. Thanks Scripting Guy.

    -MN

  • Dear Sir,

    Thank you very much for this really useful post. Could you please let me know how can I delete the first row of a csv file which is generally the heading line ?

  • Hi Ed

    Appreciate the information.

    Is it possible to do the same for multiple files with no headers.

    To elaborate, Delete Columns 4,5,6 & 7 from all .csv files in a specific directory and the files do not have a header row.

    Thanks in advance.

  • Great solution. Sometimes you tend to overthink these things.

  • This is AWESOME !!! Saved me lot of time, thanks !

  • what we should do if the title has space like "process Id"

  • and how we can delete 1 column instead of selecting the one that we want, for example if I have 10 columns how to delete 1 instead of selecting 9