In this tutorial, we're diving into the essentials of handling CSV files using PowerShell, covering a variety of scenarios you might encounter in the real world. This guide will help sysadmins and IT professionals manage data effectively, demonstrating the power and flexibility of PowerShell in working with different types of CSV files.
To set the stage, let’s create a simple CSV file to work with.
@(
    [pscustomobject]@{
        FirstName = 'Joe'
        LastName  = 'Jones,Barrett'
        Address   = '453 1st St'
        City      = 'Phoenix'
        State     = 'AZ'
    },
    [pscustomobject]@{
        FirstName = 'Jose'
        LastName  = 'Melindez'
        Address   = '5663 Main St'
        City      = 'Atlanta'
        State     = 'GA'
    }
) | Export-Csv -Path ~/Employees.csv
And read it with Get-Content.
Get-Content -Path ~/Employees.csv
Nothing too fancy, just a CSV file with a couple of fictional employees. A few things to note that will come in handy later: this CSV file uses commas as field delimiters, and each field value is enclosed in quotes. Because the quotes mark where each value begins and ends, the delimiter (a comma in this case) can appear inside a field value, as in the “Jones,Barrett” example. That’s one field.
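If you’d like to see that quoting behavior without writing a file, ConvertTo-Csv shows the same thing right in the console. This is a quick aside, not part of the original demo:
# A value containing the delimiter gets wrapped in quotes so it stays one field
[pscustomobject]@{ LastName = 'Jones,Barrett'; City = 'Phoenix' } | ConvertTo-Csv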
Also, notice I used the Get-Content cmdlet. This cmdlet isn’t smart; it reads a text file by default and barfs out the content as an array of strings. Let’s check out the type of object Get-Content returns for each CSV row.
Get-Content -Path ~/Employees.csv | Get-Member
You’ll see that it returns a simple string. Actually, it’s an array of strings. If I use the not-so-common unary comma operator to prevent PowerShell from unrolling the array…
,(Get-Content -Path ~/Employees.csv) | Get-Member
You can see that Get-Content returns an array of objects, which in this case are strings. You can tell it’s an array by the square brackets in the System.Object[] type name.
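If the unary comma operator is new to you, here’s a small standalone illustration (unrelated to the CSV files) of what it does:
# Without a leading comma, PowerShell unrolls the array, so Get-Member reports on the strings inside it
'a','b','c' | Get-Member
# The leading comma wraps the array in a one-element array; that outer array unrolls in the
# pipeline, so Get-Member reports on the original array (System.Object[]) as a single object
,('a','b','c') | Get-Member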
If I look at a single object, you’ll see that it’s just the whole line of the CSV file, including the commas and quotes.
Get-Content -Path ~/Employees.csv | Select-Object -First 1
We want objects, not raw strings, for each line of the CSV file: objects with the CSV fields as properties. To get that, I can use the Import-Csv cmdlet.
Import-Csv -Path ~/Employees.csv
That looks much better. Import-Csv has intelligently parsed the entire CSV file, converted the header row to object property names, and used the field values as property values. Let’s check the object type now.
Import-Csv -Path ~/Employees.csv | Get-Member
You can now see that the output object type is System.Management.Automation.PSCustomObject, not just a string. The Import-Csv cmdlet knew the CSV file’s structure and was able to convert it to a set of PSCustomObjects.
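Having real objects is what makes the rest of the pipeline pleasant. As a quick aside using the same sample file, you can now filter and select on those properties directly:
# Filter rows by a property value and pick specific columns
Import-Csv -Path ~/Employees.csv |
    Where-Object State -eq 'AZ' |
    Select-Object FirstName, LastName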
Here’s another type of CSV file you might run into.
Get-Content -Path ~/Employees-withcommasnoquotes.csv
Notice the difference? No quotes, and notice the second row: Jones,Barrett. That comma is actually part of Joe’s name. The value Barrett is not his address.
FYI: I created that CSV file with this code:
@(
    [pscustomobject]@{
        FirstName = 'Joe'
        LastName  = 'Jones,Barrett'
        Address   = '453 1st St'
        City      = 'Phoenix'
        State     = 'AZ'
    },
    [pscustomobject]@{
        FirstName = 'Jose'
        LastName  = 'Melindez'
        Address   = '5663 Main St'
        City      = 'Atlanta'
        State     = 'GA'
    }
) | Export-Csv -Path ~/Employees.csv
(Get-Content -Path ~/Employees.csv).replace('"','') | Set-Content -Path ~/Employees-withcommasnoquotes.csv
Let’s see how Import-Csv handles this.
Import-Csv -Path ~/Employees-withcommasnoquotes.csv
That’s not right. The LastName value is truncated, and all of the field values to its right are shifted into the wrong columns. Import-Csv is using the comma in Joe’s name as a field delimiter. And this is where the quotes come in: quotes define the entirety of a field, while the delimiter tells Import-Csv where one field ends and the next begins.
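If you’re the one producing the CSV, the simplest protection is to let Export-Csv handle the quoting, as it did for our original sample file. In PowerShell 7 and later you can also control quoting explicitly with the UseQuotes parameter; here’s a small illustration (the output path is made up for this example, and this isn’t something the original demo relies on):
# PowerShell 7+ only: quote a field only when it needs it, such as when it contains the delimiter
[pscustomobject]@{ LastName = 'Jones,Barrett'; City = 'Phoenix' } |
    Export-Csv -Path ~/QuoteDemo.csv -UseQuotes AsNeeded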
Let’s shift gears and cover another scenario you may find yourself in. Recall that I mentioned earlier that a CSV should technically have fields delimited by commas. Well, sometimes you’ll find “CSV” files with different delimiters, like tabs. By default, the Import-Csv cmdlet doesn’t like that.
For example, I have a CSV file here…
Import-Csv ~/Employeeswithtabs.csv
Things don’t look right. Import-Csv hasn’t transformed each line into an object. That’s because it was looking for a comma as a field delimiter. If I look at this file…
Get-Content ~/Employeeswithtabs.csv
You’ll see the fields are delimited by tabs and not commas. Luckily, there’s a quick fix. Simply use the Delimiter parameter.
Import-Csv ~/Employeeswithtabs.csv -Delimiter "`t"
The output looks much better now. Import-Csv is now looking for tabs as field delimiters instead of commas and is parsing the CSV file correctly.
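As an aside, the Delimiter parameter works on the export side too, so one way a tab-delimited file like this could be produced (not necessarily how the demo file was made; the output path here is just illustrative) is:
# Re-export the sample data using a tab as the field delimiter
Import-Csv -Path ~/Employees.csv |
    Export-Csv -Path ~/Employees-tabs.csv -Delimiter "`t"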
To round out this demo, let’s cover one more scenario: CSVs with no headers. Maybe you’ve been sent a CSV file from some ancient HR database with no headers that looks something like this.
Get-Content -Path ~/Employees-noheaders.csv
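If you’d like to follow along, one way to produce a headerless copy of the earlier sample file is to drop its first line (the file names here are just for illustration):
# Drop the header row to simulate a headerless export
Get-Content -Path ~/Employees.csv |
    Select-Object -Skip 1 |
    Set-Content -Path ~/Employees-noheaders.csv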
When you attempt to read the CSV with Import-Csv, this is what happens.
Import-Csv -Path ~/Employees-noheaders.csv
Import-Csv has no field names to associate with the values, so it uses the top row as the header and turns those values into property names. This isn't right. You could modify the CSV and add headers that way, but I always prefer to leave the files unchanged as much as possible. To give Import-Csv valid headers, use the Header parameter.
Import-Csv -Path ~/Employees-noheaders.csv -Header 'FirstName','LastName','Address','City','State'
Much better! When you use the Header parameter and pass it a comma-separated list of strings, Import-Csv uses those values, from left to right, as the field names rather than the first row of the CSV file. This parameter allows you to parse CSV files with no headers without modifying the original file.
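To confirm that the property names really came from the Header parameter rather than from the data, you can check the members (same file and headers as above):
# The NoteProperty names now match the headers we supplied
Import-Csv -Path ~/Employees-noheaders.csv -Header 'FirstName','LastName','Address','City','State' |
    Get-Member -MemberType NoteProperty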
You should now be on your way to a much better experience reading CSV files with PowerShell!