Veritas Vineyards
Veritas is a vineyard in Michigan. It only sells casks of wine.
Veritas has three grades of wine. In order of quality and price, they are table (cheap, urgh), event, and celebration (expensive, yummy). There are three sizes of casks: petite, standard, and grand. There are four delivery regions. In order of delivery charge: pickup (at the vineyard – there is no delivery fee), Michigan, midwest, and other.
We want a price calculator, like this:
Assume that the user doesn't make any typing mistakes. We'll add validation later.
Let's watch Simon write a price calculation program. First, let's fill his brain with knowledge about subs. Open the box of subs…
Now cut open Simon's head…
Drop in the subs knowledge module… glue his head together… there.
BTW, don't try this on your own head. You have to learn it the long way.
cmdRun_Click
is the code that's run when the user clicks the Run button. It's the program's top level code. Top level code is often called the main program.
The main program calls getInput()
, that is, tells it to run. The main program is the caller.
Memory spaces, variables, and parameters
Programming is all about the variables. Subs are all about variables, too. Remember that a variable is really a piece of memory.
Each sub gets its own memory space. It's where the sub stores its variables.
Here's the memory spaces created by Simon's code:
The memory spaces are separate. That's important. It helps with thunking.
The variables are declared inside the main program, so we have:
getInput()
cannot access the main program's variables, not directly. But how are the two going to exchange data? That's where parameters come in.
Parameters let the caller give a sub access to variables in its own memory space. Use parameters, and you're telling VBA to set up shared memory between a sub and its caller.
Suppose getInput
is like this:
Sub getInput(quality As String, size as String...)
If size = "standard" Then
...
size
isn't in getInput()
's memory space, not really. Instead, it's a reference to the caller's memory space. When the sub getInput()
messes with size
, it's really messing with the main program's memory.
The caller gets to decide which variables will be parameters.
Variables that aren't shared as parameters are kept in their own memory spaces. Both memory spaces can have a variable called count
, and there won't be any problems. The count
s could have different data types, and mean different things. Maybe count
in the main program is a String
describing this guy:
Maybe count
in getInput()
is an Integer
, the number of humps a dromedary has.
Doesn't matter. They're in different memory spaces, and won't step on each other.
Main programs coordinate
Here are Simon's calls and parameters:
The main program is coordinating the subs. It asks getInput()
to fill the variables quality
, size
, and region
. The main program doesn't care how getInput()
does that. It just wants some data.
Then the main program calls computePrice()
, sending it quality
, size
, and region
, and asking it to fill price
. Again, the main program doesn't know how computePrice()
does its job.
Finally, the main program calls output()
(that's a valid sub name), and sends it price
.
This is a common programming pattern.
You'll hear programmers talk about a sub's signature. That's this part:
getInput(quality As String, size As String, region As String)
The name of the sub, and its parameters, are the sub's signature. Each parameter has a data type, remember.
Each sub does its own part
More code. Start with the input sub. Its job is to get values for quality
, size
, and region
from somewhere. When we're writing getInput()
, we just think about getting the data. No need to take up brain space thinking about processing or output.
That last sentence is important. Remember, we want to make programming easier for our brains. Easy is good. Subs help, because they let our brains focus on one piece of the program at a time.
getInput()
doesn't do much in this case. When we add validation, it will do more.
Let's move on to computePrice()
. It takes quality
, size
, and (shipping) region
, and works out price
. We don't need to use brain space thinking about where the input data came from, or what's going to happen to price
once we've worked it out. computePrice()
just does one part of the task.
Variables declared inside a sub are called local variables. Like pricePerLiter
. It's used inside computePrice()
to work out price
. Once price
is worked out, we don't need pricePerLiter
. When a sub ends, its local variables are erased. So, pricePerLiter
goes away when the sub computePrice()
has finished.
Look at the second to last line:
price = pricePerLiter * volume + deliveryCharge * volume
When it runs the program, Excel puts a number into the variable price
. Recall that a variable is a named piece of memory. Where is that memory?
price
is declared in the main program's memory space. That's what the Dim
statement does:
Excel initializes Single
s to zero.
The main program passes price
to computePrice()
as a parameter. So price
in computePrice()
is actually a reference to price
in the main program's memory space.
Suppose that in…
… pricePerLiter * volume + deliveryCharge * volume
works out to 53. computePrice()
puts 53 into price
. It's putting that value into the main program's memory space:
When computePrice()
exits (at the End Sub
), the main program will still have 53 in price
. It can then pass that to output()
:
price
is a parameter for output()
, so it's really a reference to price
in the main program).
So what?
Focus on one chunk at a time
computePrice()
is the most complicated part of the program. When Simon is writing it, he doesn't have to think about getInput()
and output()
. The less that Simon has to have in his brain, the easier it will be to write the code.
Less screaming.
Testing and debugging is easier
Suppose the output is wrong.
Simon can put a breakpoint here:
He can check the values of quality
, size
, and region
. If one of those is wrong, the problem is in getInput()
. If they're OK and price
is wrong, the problem is in computePrice()
. If they're all OK, the problem is in output()
.
Thunking with subs helps you find bugs more quickly. Less screaming.
Coordinating a programming team is easier
Say you're working on a programming project with two other humans. You break up the job into pieces. One human will do the input, another human the processing, and you the output. Say you do the entire program in one big blob, without subs, then copy-and-paste the code all together.
You krucknuckle! You were supposed to input an
Integer
, not aString
! What is my processing code going to do with aString
?
Hey! I'm using the variable
count
to count the input records! You can't use it in the output! You krucknuckle!
I can't check the data during processing! You should have added validation to the input! You're such a krucknuckle!
Now, suppose you did it differently. First, you get together and write the main program, and the signatures of each sub (that's the sub's name and parameters). For the wine program, that would be:
You do that together, so you all understand the overall structure of the program. Then you split up to do your own chunks, each one a sub. When you copy-and-paste the subs together, there'll be no problems with data types; they're part of the signature. All of you can use the variable count
in your subs, and they won't interfere with each other.
Less screaming.
Changing chunks is easier
The people at the vineyard like the workbook so much that they want Simon to make version that outputs to a Web page. Like this:
Web pages are in the markup language HTML. Feed this HTML to a Web browser, and you get the output above:
<h2>
means heading level 2, <p>
means paragraph, and <hr>
means horizontal rule. If we can write that HTML to a file, then a Web browser will read the file and show the output above. So, we need to change the program to create the HTML, and write it to a file.
More later.
If his code were all in one VBA blob, Simon would have to be careful that his new Web output code didn't conflict with the input and processing code.
But Simon thunked it good, so it's not a problem. Here's his code for the main program.
The only change is to send quality
, size
, and region
to output()
, so they can be written to the Web page.
getInput()
doesn't change, because the input is coming from the same place. Simon doesn't need to think about it. Woohoo!
computePrice()
doesn't change. The calculation is the same. Woohoo 2!
Here's his new output code:
Recall that &
is concatenate, also known as stick-strings-together. After the HTML has been put into message
, the code opens a file in the same folder as the workbook, adds the HTML to it, and closes the file.
Simon didn't even think about the rest of the program. Easier on the brain. Less screaming.
Summary
Programming is all about the variables. Subs are all about variables, too.
Each sub gets its own memory space. Parameters let the caller give a sub access to variables in its own memory space.
A common pattern is for the main program to just coordinate subs. It's groovy.
Reasons to thunk with subs:
- Focus on one chunk at a time, so your brain isn't as full.
- Testing and debugging is easier.
- Coordinating a programming team is easier.
- Changing chunks is easier.
Less screaming.