Indexes and loops
Remember indexes? Use a variable to refer to another piece of data. For example:
Cells(row + 3, column + 2)
You've seen loops that change numbers, like this:
Put them together. Have a loop change a variable that's used as an index.
Suppose you had this:
Here's code that runs down rows 1 to 5, showing the data, one cell at a time.
What if you had 500 rows? One change:
Using loops and indexes is the key to data analysis. The code in the loop body handles one cell at a time. The loop repeats that code for 5 pieces of data, or 500, or 50,000, however much you have.
Showing 500 MsgBox
es isn't very useful. Let's write a better program. You have 500 pieces of data, and you want to count the number of negative values.
How do you work with large data sets? A loop that changes a variable (row
), and the variable is used as an index (Cells(row, 1)
).
For loops
Using loops to mess with indexes is so common that there's a special kind of loop for it: a For
loop. The general form of a For
loop is:
For [variable] = [start] To [end]
[code]
Next [variable]
Here's an example:
The first time the CPU hits line 4, it sets row
to 1. It runs the loop body, with row
as 1. The CPU gets to Next
, which tells it to move on to the next value for Row
. The CPU sets row
to 2. It runs the loop body, with row
as 2. Then the CPU sets row
to 3. It runs the loop body, with row
as 3. Then the CPU sets row
to 4. It runs the loop body, with row
as 4. Then the CPU sets row
to 5. It runs the loop body, with row
as 5. row
has reached the end value, so the CPU skips to the code after the loop.
row
is called the loop variable. It's the variable that the CPU is changing each time.
Earlier you saw code that would count negative numbers:
Here's the same thing, with a For loop:
A little simpler.
The variable negatives
is an accumulator, that is, a variable that tracks some overall value of a data set. You'll see more of them later.
Another program, that accumulates data in a string:
Here's the result:
Notice that you can name the looping variable anything you want. i
, row
, daraObrian
, whatevs. Of course, meaningful variable names are best.
Painting cells for debugging
You love the debugger, right? If you haven't been using it, you've been spending too much time on your code.
There are other ways to debug as well. Say you're writing a program that uses a loop and an index to work through a bunch of cells. But you aren't sure if your program is using the right cells. How to check?
One way is to change the background color of a cell as it is processed. Then you can see which cells your program is working on.
Here's code that counts negatives, with one new line:
The test in line 6 checks Cells(row, 1)
, with row
starting at 1, and going up to 5. Is that line checking the right cells?
Line 5 changes the background color of cell (row
, 1) to yellow. That let's you see which cells are processed. Here's what you would see when the program is done:
The program is checking the cells we wanted it to.
To remove all the cell coloring, select all the cells by clicking in the top left of the cells area:
Right-click on a cell, and set the background color to No fill:
Click on any cell, and you'll see that the background color is gone.
Exercises
Don't forget the debugger! It's your BFF.
Write a program to count the number of times someone typed "dog". Or "DOG", or "Dog", or "Dog ", or " dOg ". In other words, can doesn't matter, and you need to get rid of spaces.
The Run button has been created for you, and some of the code written (the easy stuff). Complete the program.
Use a For loop.
Upload your Excel file.
(If you were logged in as a student, you could submit an exercise solution, and get some feedback.)
Summary
Using loops and indexes is the key to data analysis. The code in the loop body handles one cell at a time. The loop repeats that code for 5 pieces of data, or 500, or 50,000, however much you have.
Using loops to mess with indexes is so common that there's a special kind of loop for it: a For
loop.