Better thunking

Veritas Vineyards

Veritas is a vineyard in Michigan. It only sells casks of wine.

Veritas has three grades of wine. In order of quality and price, they are table (cheap, urgh), event, and celebration (expensive, yummy). There are three sizes of casks: petite, standard, and grand. There are four delivery regions. In order of delivery charge: pickup (at the vineyard – there is no delivery fee), Michigan, midwest, and other.

We want a price calculator, like this:

Price calculator

Assume that the user doesn't make any typing mistakes. We'll add validation later.

Let's watch Simon write a price calculation program. First, let's fill his brain with knowledge about subs. Open the box of subs…

Subs box

Now cut open Simon's head…

Open head

Drop in the subs knowledge module… glue his head together… there.

BTW, don't try this on your own head. You have to learn it the long way.

Simon
Simon
Let's see. Looks like an IPO again. Three chunks.

  1. Private Sub cmdRun_Click()
  2.     'Input
  3.     'Processing
  4.     'Output
  5. End Sub

cmdRun_Click is the code that's run when the user clicks the Run button. It's the program's top level code. Top level code is often called the main program.

Simon
Simon
I'll start with the input chunk. Make a sub for it.

  1. Private Sub cmdRun_Click()
  2.     'Input
  3.     getInput
  4.     'Processing
  5.     'Output
  6. End Sub
  7. 'Input
  8. Sub getInput()
  9. End Sub

The main program calls getInput(), that is, tells it to run. The main program is the caller.

Memory spaces, variables, and parameters

Programming is all about the variables. Subs are all about variables, too. Remember that a variable is really a piece of memory.

Each sub gets its own memory space. It's where the sub stores its variables.

Here's the memory spaces created by Simon's code:

Memory spaces

The memory spaces are separate. That's important. It helps with thunking.

Simon
Simon
Hmm… The program needs three pieces of data in variables. The wine quality, cask size, and delivery region. I'll declare the variables in the main program.

  1. Private Sub cmdRun_Click()
  2.     Dim quality As String
  3.     Dim size As String
  4.     Dim region As String
  5. ...
The variables are declared inside the main program, so we have:

Memory spaces with variables

getInput() cannot access the main program's variables, not directly. But how are the two going to exchange data? That's where parameters come in.

  1. Private Sub cmdRun_Click()
  2.     Dim quality As String
  3.     Dim size As String
  4.     Dim region As String
  5.     'Input
  6.     'Processing
  7.     getInput quality, size, region
  8.     'Output
  9. End Sub
  10. 'Input
  11. Sub getInput(quality As String, size As String, region As String)
  12. End Sub
Parameters let the caller give a sub access to variables in its own memory space. Use parameters, and you're telling VBA to set up shared memory between a sub and its caller.

Parameters

Suppose getInput is like this:

Sub getInput(quality As String, size as String...)
    If size = "standard" Then
        ...

size isn't in getInput()'s memory space, not really. Instead, it's a reference to the caller's memory space. When the sub getInput() messes with size, it's really messing with the main program's memory.

The caller gets to decide which variables will be parameters.

Locals

Variables that aren't shared as parameters are kept in their own memory spaces. Both memory spaces can have a variable called count, and there won't be any problems. The counts could have different data types, and mean different things. Maybe count in the main program is a String describing this guy:

Count

Maybe count in getInput() is an Integer, the number of humps a dromedary has.

Doesn't matter. They're in different memory spaces, and won't step on each other.

Main programs coordinate

Here are Simon's calls and parameters:

  1. Option Explicit
  2. Private Sub cmdRun_Click()
  3.     Dim quality As String
  4.     Dim size As String
  5.     Dim region As String
  6.     Dim price As Single
  7.     'Input
  8.     getInput quality, size, region
  9.     'Processing
  10.     computePrice quality, size, region, price
  11.     'Output
  12.     output price
  13. End Sub
  14. 'Input
  15. Sub getInput(quality As String, size As String, region As String)
  16. ...
  17. End Sub
  18. 'Processing
  19. Sub computePrice(quality As String, size As String, region As String, price As Single)
  20. ...
  21. End Sub
  22. 'Output
  23. Sub output(price As Single)
  24. ...
  25. End Sub
The main program is coordinating the subs. It asks getInput() to fill the variables quality, size, and region. The main program doesn't care how getInput() does that. It just wants some data.

Then the main program calls computePrice(), sending it quality, size, and region, and asking it to fill price. Again, the main program doesn't know how computePrice() does its job.

Finally, the main program calls output() (that's a valid sub name), and sends it price.

This is a common programming pattern.

Pattern

Main program calls subs

Situation:
You have a relatively complex program. It's too big to think about all at once.
Actions:
Break the program into subs. The main program coordinates passing data between the subs.

You'll hear programmers talk about a sub's signature. That's this part:

getInput(quality As String, size As String, region As String)

The name of the sub, and its parameters, are the sub's signature. Each parameter has a data type, remember.

Each sub does its own part

More code. Start with the input sub. Its job is to get values for quality, size, and region from somewhere. When we're writing getInput(), we just think about getting the data. No need to take up brain space thinking about processing or output.

That last sentence is important. Remember, we want to make programming easier for our brains. Easy is good. Subs help, because they let our brains focus on one piece of the program at a time.

  1. 'Input
  2. Sub getInput(quality As String, size As String, region As String)
  3.     quality = Cells(5, 2)
  4.     size = Cells(8, 2)
  5.     region = Cells(11, 2)
  6. End Sub
getInput() doesn't do much in this case. When we add validation, it will do more.

Let's move on to computePrice(). It takes quality, size, and (shipping) region, and works out price. We don't need to use brain space thinking about where the input data came from, or what's going to happen to price once we've worked it out. computePrice() just does one part of the task.

  1. 'Processing
  2. Sub computePrice(quality As String, size As String, region As String, price As Single)
  3.     Dim pricePerLiter As Single
  4.     Dim volume As Integer
  5.     Dim deliveryCharge As Single
  6.     'Compute price per liter
  7.     If quality = "table" Then
  8.         pricePerLiter = 8.5
  9.     ElseIf quality = "event" Then
  10.         pricePerLiter = 12.8
  11.     Else
  12.         pricePerLiter = 22.9
  13.     End If
  14.     'Compute volume in liters
  15.     If size = "petite" Then
  16.         volume = 4
  17.     ElseIf size = "standard" Then
  18.         volume = 8
  19.     Else
  20.         volume = 15
  21.     End If
  22.     'Compute delivery charge per liter
  23.     If region = "pickup" Then
  24.         deliveryCharge = 0
  25.     ElseIf region = "Michigan" Then
  26.         deliveryCharge = 1.2
  27.     ElseIf region = "midwest" Then
  28.         deliveryCharge = 4.55
  29.     Else
  30.         deliveryCharge = 8.22
  31.     End If
  32.     'Compute total
  33.     price = pricePerLiter * volume + deliveryCharge * volume
  34. End Sub
Variables declared inside a sub are called local variables. Like pricePerLiter. It's used inside computePrice() to work out price. Once price is worked out, we don't need pricePerLiter. When a sub ends, its local variables are erased. So, pricePerLiter goes away when the sub computePrice() has finished.

Look at the second to last line:

price = pricePerLiter * volume + deliveryCharge * volume

When it runs the program, Excel puts a number into the variable price. Recall that a variable is a named piece of memory. Where is that memory?

  1. Private Sub cmdRun_Click()
  2.     ...
  3.     Dim price As Single
  4.     ...
  5.     computePrice quality, size, region, price
  6.     ...
  7. End Sub
  8. 'Processing
  9. Sub computePrice(quality As String, size As String, region As String, price As Single)
  10.     ...
  11.     price = pricePerLiter * volume + deliveryCharge * volume
  12. End Sub
price is declared in the main program's memory space. That's what the Dim statement does:

Declaring price

Excel initializes Singles to zero.

The main program passes price to computePrice() as a parameter. So price in computePrice() is actually a reference to price in the main program's memory space.

Suppose that in…

  1. Sub computePrice(quality As String, size As String, region As String, price As Single)
  2.     ...
  3.     price = pricePerLiter * volume + deliveryCharge * volume
  4. End Sub
pricePerLiter * volume + deliveryCharge * volume works out to 53. computePrice() puts 53 into price. It's putting that value into the main program's memory space:

Price

When computePrice() exits (at the End Sub), the main program will still have 53 in price. It can then pass that to output():

  1. 'Output
  2. Sub output(price As Single)
  3.     Cells(17, 2) = price
  4. End Sub
price is a parameter for output(), so it's really a reference to price in the main program).

So what?

Focus on one chunk at a time

computePrice() is the most complicated part of the program. When Simon is writing it, he doesn't have to think about getInput() and output(). The less that Simon has to have in his brain, the easier it will be to write the code.

Less screaming.

Testing and debugging is easier

Suppose the output is wrong.

Simon
Simon
Mon dieu!

Simon can put a breakpoint here:

Debug

He can check the values of quality, size, and region. If one of those is wrong, the problem is in getInput(). If they're OK and price is wrong, the problem is in computePrice(). If they're all OK, the problem is in output().

Thunking with subs helps you find bugs more quickly. Less screaming.

Coordinating a programming team is easier

Say you're working on a programming project with two other humans. You break up the job into pieces. One human will do the input, another human the processing, and you the output. Say you do the entire program in one big blob, without subs, then copy-and-paste the code all together.

You krucknuckle! You were supposed to input an Integer, not a String! What is my processing code going to do with a String?

Hey! I'm using the variable count to count the input records! You can't use it in the output! You krucknuckle!

I can't check the data during processing! You should have added validation to the input! You're such a krucknuckle!

Now, suppose you did it differently. First, you get together and write the main program, and the signatures of each sub (that's the sub's name and parameters). For the wine program, that would be:

  1. Private Sub cmdRun_Click()
  2.     Dim quality As String
  3.     Dim size As String
  4.     Dim region As String
  5.     Dim price As Single
  6.     'Input
  7.     getInput quality, size, region
  8.     'Processing
  9.     computePrice quality, size, region, price
  10.     'Output
  11.     output price
  12. End Sub
  13. 'Input
  14. Sub getInput(quality As String, size As String, region As String)
  15. End Sub
  16. 'Processing
  17. Sub computePrice(quality As String, size As String, region As String, price As Single)
  18. End Sub
  19. 'Output
  20. Sub output(price As Single)
  21. End Sub
You do that together, so you all understand the overall structure of the program. Then you split up to do your own chunks, each one a sub. When you copy-and-paste the subs together, there'll be no problems with data types; they're part of the signature. All of you can use the variable count in your subs, and they won't interfere with each other.

Less screaming.

Changing chunks is easier

The people at the vineyard like the workbook so much that they want Simon to make version that outputs to a Web page. Like this:

Web output

Web pages are in the markup language HTML. Feed this HTML to a Web browser, and you get the output above:

  1. <h2>Wine Order Price</h2>
  2. <p>Quality: table</p>
  3. <p>Size: petite</p>
  4. <p>Region: pickup</p>
  5. <p>Price: 34</p>
  6. <hr>
<h2> means heading level 2, <p> means paragraph, and <hr> means horizontal rule. If we can write that HTML to a file, then a Web browser will read the file and show the output above. So, we need to change the program to create the HTML, and write it to a file.

More later.

If his code were all in one VBA blob, Simon would have to be careful that his new Web output code didn't conflict with the input and processing code.

But Simon thunked it good, so it's not a problem. Here's his code for the main program.

  1. Private Sub cmdRun_Click()
  2.     Dim quality As String
  3.     Dim size As String
  4.     Dim region As String
  5.     Dim price As Single
  6.     'Input
  7.     getInput quality, size, region
  8.     'Processing
  9.     computePrice quality, size, region, price
  10.     'Output
  11.     output quality, size, region, price
  12. End Sub
The only change is to send quality, size, and region to output(), so they can be written to the Web page.

getInput() doesn't change, because the input is coming from the same place. Simon doesn't need to think about it. Woohoo!

computePrice() doesn't change. The calculation is the same. Woohoo 2!

Here's his new output code:

  1. 'Output
  2. Sub output(quality As String, size As String, region As String, price As Single)
  3.     Dim message As String
  4.     'Build the HTML.
  5.     message = "<h2>Wine Order Price</h2>" & _
  6.         "<p>Quality: " & quality & "</p>" & _
  7.         "<p>Size: " & size & "</p>" & _
  8.         "<p>Region: " & region & "</p>" & _
  9.         "<p>Price: " & price & "</p>" & _
  10.         "<hr>"
  11.     'Append the HTML to the Web page.
  12.     Open ThisWorkbook.Path & "\wine-price.html" For Append As #1
  13.     Print #1, message
  14.     Close #1
  15. End Sub
Recall that & is concatenate, also known as stick-strings-together. After the HTML has been put into message, the code opens a file in the same folder as the workbook, adds the HTML to it, and closes the file.

Simon didn't even think about the rest of the program. Easier on the brain. Less screaming.

Summary

Programming is all about the variables. Subs are all about variables, too.

Each sub gets its own memory space. Parameters let the caller give a sub access to variables in its own memory space.

A common pattern is for the main program to just coordinate subs. It's groovy.

Reasons to thunk with subs:

  • Focus on one chunk at a time, so your brain isn't as full.
  • Testing and debugging is easier.
  • Coordinating a programming team is easier.
  • Changing chunks is easier.

Less screaming.