I get annoyed when I'm talking with someone and they keep repeating themselves. They state a fact, I nod in confirmation, they talk about something else and then state the same fact again. I nod again to be polite, but a little less politely. They state the same fact a third time and my filter gets weak. At that point, I might tell them they've already told me that twice. These people need to practice the DRY principle! Does this happen to you?

I'm anal about efficiency, and I'm not just talking about my code. I can't stand doing something twice in the physical or virtual world. Where else do you think this blog got its name?

Repeating yourself is fruitless labor, a drain on the already short time we have on this earth. If I had enough money I'd completely automate my house. I'd automate taking out the trash every week, putting dishes in the dishwasher and even my shower! I know. I'm nuts.

This OCD attitude means I've embraced the DRY principle.

DRY stands for Don't Repeat Yourself. It's a software development principle that applies to all programming languages, and it has a single purpose: write the code that produces a desired outcome once and never write it again.

If you notice a place in your code where you're beginning to repeat yourself, don't rewrite the code. Don't (heaven forbid) copy and paste the same code to do it again. Reference your original code! Use your original code as a pointer.

Don't repeat yourself!

Repeating yourself makes your code inefficient, ugly and hard to manage.  My eyes bleed when I see something like this.

[powershell]
$TextFile1 = 'textfile3.txt'
$TextFile2 = 'textfile2.txt'
$TextFile3 = 'textfile1.txt'

## Get the content of all the text files
$Content1 = Get-Content $TextFile1
$Content2 = Get-Content $TextFile2
$Content3 = Get-Content $TextFile3

## Merge all text file content together to create a single text file
$AppendedContent = $Content1 + $Content2 + $Content3
$AppendedContent | Add-Content 'C:\mergedtextcontent.txt'
[/powershell]

If you see code like this, here are a few no-nos you can point out to the author since you've never written any code like this before, right?

Note: I'll be using the PowerShell language as a reference to code here. But know that PowerShell is, by no means, the only language to use the DRY principle in.

What's wrong here?

It's Not Reusable

Let's say you've got hundreds or thousands of text files to combine. Perhaps you wrote this code a few months ago and it worked for a small set of text files. But your boss comes to you now with a few thousand. Great.

You can't take this code and add a few thousand lines to make it work for more files. You need to find another way to handle that situation. You'd need to write another script and throw this one away. If this had been written following the DRY principle, you could have passed a parameter to your old script and been done with it.
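Here's a rough sketch of what "pass a parameter and be done with it" could look like. The function name Merge-TextFile and its -Path and -Destination parameters are made up for illustration (they aren't from the original script), and the demo files live in a throwaway temp folder so it can run anywhere.

```powershell
## Hypothetical function: the name and both parameters are illustrative
function Merge-TextFile {
    param(
        [Parameter(Mandatory)]
        [string[]]$Path,

        [Parameter(Mandatory)]
        [string]$Destination
    )

    ## Get-Content accepts an array of paths, so three files or three
    ## thousand go through the exact same line
    Get-Content -Path $Path | Add-Content -Path $Destination
}

## Demo with throwaway files in a temp folder
$DemoFolder = Join-Path ([System.IO.Path]::GetTempPath()) ("dry-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $DemoFolder | Out-Null
'one' | Set-Content (Join-Path $DemoFolder 'textfile1.txt')
'two' | Set-Content (Join-Path $DemoFolder 'textfile2.txt')

$InputFiles = @(
    (Join-Path $DemoFolder 'textfile1.txt'),
    (Join-Path $DemoFolder 'textfile2.txt')
)
$Merged = Join-Path $DemoFolder 'merged.out'

Merge-TextFile -Path $InputFiles -Destination $Merged
```

When the boss shows up with a few thousand files, the only thing that changes is what gets passed to -Path.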

It's Not Extensible

Additional functionality can be bolted onto well-written scripts.  Scripts should be built like blocks.

  • Don't write a script that's a fully-developed solution.
  • Write scripts to do one thing and one thing only.
  • Allow yourself or someone else to build upon it in an easy-to-understand manner.

This is why I hate to see people use Format-Table inside of a script rather than at the console when the script is called.

Format-Table is an output-only cmdlet. It emits formatting objects instead of the objects it was given, so once it's called, you're finishing off and telling everyone that no more blocks will be added!
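To see why Format-Table closes the door, here's a small sketch. The sample objects and their Name and SizeKB properties are made up; they stand in for whatever your script really outputs. The same filter is run once on the raw objects and once after Format-Table.

```powershell
## Made-up sample data standing in for real command output
$Files = @(
    [pscustomobject]@{ Name = 'small.txt'; SizeKB = 4 }
    [pscustomobject]@{ Name = 'large.txt'; SizeKB = 900 }
)

## Objects straight off the pipeline still have their properties,
## so another block can be stacked on top
$Large = $Files | Where-Object { $_.SizeKB -gt 100 }

## After Format-Table the pipeline carries formatting records, not the
## original objects, so the same filter has no SizeKB to look at
$AfterFormat = $Files | Format-Table | Where-Object { $_.SizeKB -gt 100 }

"Matches before Format-Table: $(@($Large).Count)"
"Matches after Format-Table:  $(@($AfterFormat).Count)"
```

Leave the formatting to whoever calls the script and they can still filter, sort or export what it returns.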

Let's say I want to check that these text files exist before I try to use Get-Content. How would I do that in this example? I'd have to repeat the same pattern for every single file.

[powershell]
$TextFile1 = 'textfile3.txt'
$TextFile2 = 'textfile2.txt'

## Test to ensure the first file is there
if (!(Test-Path $TextFile1)) {
    Write-Output "The $TextFile1 file isn't there"
} else {
    ## Get the content of the first text file
    $Content1 = Get-Content $TextFile1
}
## Same test, copied and pasted for the second file
if (!(Test-Path $TextFile2)) {
    Write-Output "The $TextFile2 file isn't there"
} else {
    ## Get the content of the second text file
    $Content2 = Get-Content $TextFile2
}
[/powershell]

You'd have no hair and a huge knot on your forehead from pulling your hair out and beating your head on the desk attempting to get this to work.

It's Not Readable

This code may make sense to you but what if others need to use it? They don't know what your pattern is here.

They're going to have to look down through each line. That's a waste of time.

For all they know, you've got some funky exception in there somewhere for that one file they need to know about.  It'd take them forever to understand the code.

Occam's Razor - The simplest solution is usually the best

You've got 10 lines here (comments included) that will have to be understood, maintained and possibly added to. Common sense tells you that if you can do the same thing (combining text files into one) with two or three lines of code, that'd be better, right?

When presented with this question my own mother could answer that one correctly. She doesn't even need to understand anything about PowerShell.

How can it be made better?

Notice patterns in the code and simplify them

In this example, the input is three text files and I decided to create three variables. I have three Get-Content lines and a single line appending them all together.  I've got some commonalities in the code.

I'm implicitly saying each text file is in the same directory since I'm not specifying a directory. I don't know if these are the only files in the directory but let's say they are.

Instead of defining each file on its own, why not "get" them all with a single line?

[powershell]
$MyTextFiles = Get-ChildItem 'C:\MyTextFiles'
[/powershell]

I've now "gotten" all three of my text files into a variable with a single line instead of three, for a savings of two lines! w00t!

Next, I'm noticing another commonality of using Get-Content.

I want to read the contents of each file, and PowerShell has the pipeline for exactly this. I can use the pipeline to pipe the output of Get-ChildItem (which is the files themselves) into the input of Get-Content directly in the same line of code!

[powershell]
$MyTextFilesContent = Get-ChildItem 'C:\MyTextFiles' | Get-Content
[/powershell]

Now, not only did I remove another three lines of code from the script (Readability), I made it more Extensible and Reusable. It no longer matters how many text files there are in the directory. It will get the contents of all of them. I can now use this to get three text files or 1,000.
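For completeness, here's a sketch of the whole merge done this way, including the final write to a single file. The folder, file names and merged file location are throwaway values I made up so the example can run anywhere, and the -Filter '*.txt' is my own addition to keep stray non-text files out.

```powershell
## Throwaway folder and files (assumptions, not the real C:\MyTextFiles)
$SourceFolder = Join-Path ([System.IO.Path]::GetTempPath()) ("mytextfiles-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $SourceFolder | Out-Null
1..3 | ForEach-Object {
    "content of file $_" | Set-Content (Join-Path $SourceFolder "textfile$_.txt")
}

## Keep the merged file outside the source folder so it's never read as input
$MergedFile = Join-Path ([System.IO.Path]::GetTempPath()) ("merged-" + [guid]::NewGuid() + ".txt")

## Find every .txt file, read each one and write all of the lines to one file
Get-ChildItem -Path $SourceFolder -Filter '*.txt' |
    Get-Content |
    Set-Content -Path $MergedFile
```

That last statement is the entire script. Everything above it is only there to set up the demo.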

Combine like elements in arrays or hashtables

In the previous example, I assumed all these files were in the same directory. Let's say they're spread across a volume in different folders. Perhaps there's no good way to use Get-ChildItem to find them. It's not as good, but the next best bet would be to put them in an array like this:

[powershell]
@('C:\MyTextFiles\textfile3.txt','C:\SomeOtherFolder\textfile2.txt','C:\Windows\textfile1.txt')
[/powershell]

You're still having to define them but they are at least grouped together in an array.

Once in an array you can then perform common functions on the entire array (Get-Content in this case) like this:

[powershell]
@('C:\MyTextFiles\textfile3.txt','C:\SomeOtherFolder\textfile2.txt','C:\Windows\textfile1.txt') | Get-Content
[/powershell]

I couldn't cut down on defining the three text files. But at least I replaced the three Get-Content lines with a single call that performs a common function on all the text files.

Use functions

In this example, functions would be overkill. But functions should be par for the course when writing more complex scripts.

Functions are great for grouping common code elements together.

Coming from the previous example of grouping text files, let's say you also wanted to first test to ensure the text files are there. You'd need to, because if you aren't using Get-ChildItem you don't really know whether they exist. You want to use Test-Path to make sure they're there before you try to use Get-Content. Here's a way you could do this with a function.

[powershell]
function Get-MyContent ($FilePath) {
    ## Only read the file if it's actually there
    if (!(Test-Path $FilePath)) {
        $false
    } else {
        Get-Content $FilePath
    }
}

$TextFiles = @('C:\MyTextFiles\textfile3.txt','C:\SomeOtherFolder\textfile2.txt','C:\Windows\textfile1.txt')
## Start with an empty array so += appends lines instead of building one long string
$CombinedContent = @()
foreach ($TextFile in $TextFiles) {
    $Content = Get-MyContent -FilePath $TextFile
    if ($Content) {
        $CombinedContent += $Content
    }
}
[/powershell]

This would ensure no errors are thrown when you try to get the contents of each file. A function would be overkill in this example, but do you see how it opens up possibilities for code reuse and extensibility? For example, let's say you want to only get text file contents that contain a certain string. You can now put that in your function and it would be run for each text file.
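To make that last idea concrete, here's one way the function might grow a string filter. Treat it as a sketch: the -Pattern parameter, the sample files and the 'error' search string are all my own inventions for the demo. This version also returns nothing rather than $false for a skipped file, so the caller doesn't need the extra if check.

```powershell
## Hypothetical extension of Get-MyContent: -Pattern is an illustrative parameter
function Get-MyContent {
    param(
        [Parameter(Mandatory)]
        [string]$FilePath,

        [string]$Pattern
    )

    ## Missing file: return nothing
    if (-not (Test-Path -LiteralPath $FilePath)) { return }

    ## When a pattern is given, skip files that don't contain that literal string
    if ($Pattern -and -not (Select-String -LiteralPath $FilePath -Pattern $Pattern -SimpleMatch -Quiet)) {
        return
    }

    Get-Content -LiteralPath $FilePath
}

## Demo with throwaway files in a temp folder
$DemoFolder = Join-Path ([System.IO.Path]::GetTempPath()) ("dry-filter-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $DemoFolder | Out-Null
'error: disk full' | Set-Content (Join-Path $DemoFolder 'a.txt')
'all good'         | Set-Content (Join-Path $DemoFolder 'b.txt')

$TextFiles = @(
    (Join-Path $DemoFolder 'a.txt'),
    (Join-Path $DemoFolder 'b.txt'),
    (Join-Path $DemoFolder 'missing.txt')
)

## The calling loop doesn't change at all; the new check lives in one place
$CombinedContent = @(foreach ($TextFile in $TextFiles) {
    Get-MyContent -FilePath $TextFile -Pattern 'error'
})
```

The filter was written once and it runs for every file, whether there are three of them or 1,000.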

Summary

I hope this beginner lesson helped you out. When I was first starting out, I wrote scripts that make my eyes bleed now, just because I didn't know what else I could do. Not only did this post show you a few things PowerShell can do, I hope it also opened your eyes a little to the general methodologies of efficient coding.