Indexes and loops are BFFs

Keywords: 

Indexes and loops

Remember indexes? Use a variable to refer to another piece of data. For example:

Cells(row + 3, column + 2)

You've seen loops that change numbers, like this:

  1. Dim x As Integer
  2. x = 1
  3. Do
  4.     MsgBox "The square of " & x & " is " & x * x
  5.     x = x + 1
  6. Loop While x <= 5
Put them together. Have a loop change a variable that's used as an index.

Suppose you had this:

Data

Here's code that runs down rows 1 to 5, showing the data, one cell at a time.

  1. Dim row As Integer
  2. row = 1
  3. Do
  4.     MsgBox "Row " & row & ", column 1 is " & Cells(row, 1)
  5.     row = row + 1
  6. Loop While row <= 5
What if you had 500 rows? One change:

  1. Dim row As Integer
  2. row = 1
  3. Do
  4.     MsgBox "Row " & row & ", column 1 is " & Cells(row, 1)
  5.     row = row + 1
  6. Loop While row <= 500
Using loops and indexes is the key to data analysis. The code in the loop body handles one cell at a time. The loop repeats that code for 5 pieces of data, or 500, or 50,000, however much you have.

Showing 500 MsgBoxes isn't very useful. Let's write a better program. You have 500 pieces of data, and you want to count the number of negative values.

  1. Dim row As Integer
  2. Dim negatives As Integer
  3. negatives = 0
  4. row = 1
  5. Do
  6.     If Cells(row, 1) < 0 Then
  7.         negatives = negatives + 1
  8.     End If
  9.     row = row + 1
  10. Loop While row <= 500
  11. MsgBox "There are " & negatives & " negatives."
How do you work with large data sets? A loop that changes a variable (row), and the variable is used as an index (Cells(row, 1)).

For loops

Using loops to mess with indexes is so common that there's a special kind of loop for it: a For loop. The general form of a For loop is:

For [variable] = [start] To [end]
    [code]
Next [variable]

Here's an example:

  1. Dim row As Integer
  2. For row = 1 To 5
  3.     MsgBox Cells(row, 1)
  4. Next row
The first time the CPU hits line 4, it sets row to 1. It runs the loop body, with row as 1. The CPU gets to Next, which tells it to move on to the next value for Row. The CPU sets row to 2. It runs the loop body, with row as 2. Then the CPU sets row to 3. It runs the loop body, with row as 3. Then the CPU sets row to 4. It runs the loop body, with row as 4. Then the CPU sets row to 5. It runs the loop body, with row as 5. row has reached the end value, so the CPU skips to the code after the loop.

For loop

row is called the loop variable. It's the variable that the CPU is changing each time.

Earlier you saw code that would count negative numbers:

  1. Dim row As Integer
  2. Dim negatives As Integer
  3. negatives = 0
  4. row = 1
  5. Do
  6.     If Cells(row, 1) < 0 Then
  7.         negatives = negatives + 1
  8.     End If
  9.     row = row + 1
  10. Loop While row <= 500
  11. MsgBox "There are " & negatives & " negatives."
Here's the same thing, with a For loop:

  1. Dim row As Integer
  2. Dim negatives As Integer
  3. negatives = 0
  4. For row = 1 To 500
  5.     If Cells(row, 1) < 0 Then
  6.         negatives = negatives + 1
  7.     End If
  8. Next row
  9. MsgBox "There are " & negatives & " negatives."
A little simpler.

The variable negatives is an accumulator, that is, a variable that tracks some overall value of a data set. You'll see more of them later.

Another program, that accumulates data in a string:

  1. Dim i As Integer
  2. Dim message As String
  3. message = "Numbers: "
  4. For i = 1 To 5
  5.     message = message & i & " "
  6. Next i
  7. MsgBox message
Here's the result:

Output

Notice that you can name the looping variable anything you want. i, row, daraObrian, whatevs. Of course, meaningful variable names are best.

Painting cells for debugging

You love the debugger, right? If you haven't been using it, you've been spending too much time on your code.

There are other ways to debug as well. Say you're writing a program that uses a loop and an index to work through a bunch of cells. But you aren't sure if your program is using the right cells. How to check?

One way is to change the background color of a cell as it is processed. Then you can see which cells your program is working on.

Here's code that counts negatives, with one new line:

  1. Dim row As Integer
  2. Dim negatives As Integer
  3. negatives = 0
  4. For row = 1 To 5
  5.     Cells(row, 1).Interior.Color = vbYellow
  6.     If Cells(row, 1) < 0 Then
  7.         negatives = negatives + 1
  8.     End If
  9. Next row
  10. MsgBox "There are " & negatives & " negatives."
The test in line 6 checks Cells(row, 1), with row starting at 1, and going up to 5. Is that line checking the right cells?

Line 5 changes the background color of cell (row, 1) to yellow. That let's you see which cells are processed. Here's what you would see when the program is done:

Cells processed

The program is checking the cells we wanted it to.

To remove all the cell coloring, select all the cells by clicking in the top left of the cells area:

Select all cells

Right-click on a cell, and set the background color to No fill:

No fill

Click on any cell, and you'll see that the background color is gone.

Exercises

Don't forget the debugger! It's your BFF.

Exercise: Dog count
408 people filled in a Web survey, typing in their favorite pet. Dog, cat, bird, whatevs. The data was put into a spreadsheet you can download. Here's part of it:

Data

Write a program to count the number of times someone typed "dog". Or "DOG", or "Dog", or "Dog ", or " dOg ". In other words, can doesn't matter, and you need to get rid of spaces.

The Run button has been created for you, and some of the code written (the easy stuff). Complete the program.

Use a For loop.

Upload your Excel file.

Attachments

dog-count-start.xlsm

(If you were logged in as a student, you could submit an exercise solution, and get some feedback.)

Summary

Using loops and indexes is the key to data analysis. The code in the loop body handles one cell at a time. The loop repeats that code for 5 pieces of data, or 500, or 50,000, however much you have.

Using loops to mess with indexes is so common that there's a special kind of loop for it: a For loop.