Extracting to an array

Keywords: 

Extraction

Extracting data into a separate data set:

Sales

Makes it easier to write clean code, like this:

extractData dataOK
If Not dataOK Then
    End
End If
processData
outputResults

Code that's easier to test and debug. More reliable. Less screaming.

You've seen how to put the extracted data into a different part of a worksheet. But that's not the only place. You can also put extracted data into an array.

What's an array?

Arrays are a cross between variables and worksheets. They're variables, since they're in memory, like all variables. They're like worksheets, because they can have rows and columns.

Memory

Looks like rows is an Integer, although it could be a Single that happens to be a whole number. highest is a Single. animal and worst are Strings. animal is the best animal. worst is the worst Invader. Or the best, depending on your perspective.

grades and temperatures are arrays. grades stores students' grades on assignments. Like a regular worksheet, it has rows and columns. Just like a worksheet, you can use the rows and columns any way you like. The columns in grades are for assignments. The first assignment was out of 25, the second out of 10, the third out of 20, and the fourth out of 10.

There's one row in grades for each student. Row 1 is for the first student, row 2 for the second, and so on.

grades is a two-dimensional array. temperatures is a one-dimensional array. It's like a column from a worksheet.

Arrays are not as flexible as worksheets. In VBA arrays, every cell must be the same data type. They must all be Integers, Strings, whatevs. In a worksheet, you can have a string in one cell, an integer in another, no probs.

Declaring arrays

Examples of declaring arrays:

Dim sales(100) as Single
Dim words(2000) as String
Dim grades(100, 50) as Integer

The numbers are how many elements are in the array. So you can 100 elements in sales.

Actually, you can put 101, because sales(0) also exists. We'll pretend it doesn't.

Referring to array elements

You refer to array elements the same way you refer to cells. With indexes. For example:

s1 = sales(4)
day = 5
sales(day) = 55
student = 2
assignment = 1
MsgBox "Student " & student & " earned " & grades(student, assignment) & " on assignment " & assignment & "."

Let's analyze the last one. Here's the data again:

Memory

In grades, there's a row for each student, and a column for each assignment. We want to get student 2's score on assignment 1. That's row 2, column 1. So:

student = 2
assignment = 1
MsgBox "Student " & student & " earned " & grades(student, assignment) & " on assignment " & assignment & "."

If the grades data was stored in a worksheet, we'd have:

student = 2
assignment = 1
MsgBox "Student " & student & " earned " & Cells(student, assignment) & " on assignment " & assignment & "."

Almost identical. grades(student, assignment) instead of Cells(student, assignment).

Extracting to arrays

Let's say we have the sales data again:

Sales data

Here's the code we used to extract sales more that 250 to the worksheet:

  1. Private Sub cmdRun_Click()
  2.     Dim rowSource As Integer
  3.     Dim rowDestination as Integer
  4.     rowSource = 0
  5.     rowDestination = 0
  6.     Do While Cells(rowSource + 1, 1) <> ""
  7.         rowSource = rowSource + 1
  8.         'Is the data above 250?
  9.         If Cells(rowSource, 1) > 250 Then
  10.             'Copy value to destination.
  11.             rowDestination = rowDestination + 1
  12.             Cells(rowDestination, 3) = Cells(rowSource, 1)
  13.         End If
  14.     Loop
  15. End Sub
There are two index variables. rowSource is used to find the end of the source data. It's incremented each time through the loop (line 13). rowDestination is an index into the extracted data. It's incremented only when data is copied to the extracted data set (line 10).

Let's make a version of the code that puts the extracted data into an array.

  1. Private Sub cmdRun_Click()
  2.     Dim sales(500) As Integer
  3.     Dim rowSource As Integer
  4.     Dim rowDestination As Integer
  5.     rowSource = 0
  6.     rowDestination = 0
  7.     Do While Cells(rowSource + 1, 1) <> ""
  8.         rowSource = rowSource + 1
  9.         'Is the data above 250?
  10.         If Cells(rowSource, 1) > 250 Then
  11.             'Copy value to destination.
  12.             rowDestination = rowDestination + 1
  13.             sales(rowDestination) = Cells(rowSource, 1)
  14.         End If
  15.         rowSource = rowSource + 1
  16.     Loop
  17. End Sub
Hardly anything is different. Line 2 declares the array. Line 12 copies data into the array, for values more than 250.

Let's compare the two versions of that copying code:

Cells(rowDestination, 3) = Cells(rowSource, 1)
sales(rowDestination) = Cells(rowSource, 1)

sales is a one-dimensional array, so it only needs one number in the parens [ the ()].

Why use arrays?

Arrays are less flexible than worksheets, since all the data elements have to be the same data type. All Single, all String, whatevs.

Other programming languages, like PHP and JavaScript, let you put data of different types in the same array, like having an array that has strings in some cells, integers in other, and singles in others. VBA does not let you do that. All elements of a VBA array have to be the same data type.

Also, you can't see arrays directly. They're just in memory.

So, why use arrays?

Two reasons:

  • Every programming language you'll use has arrays, and they work more-or-less like VBAs. However, only Excel VBA has Excel worksheets available. You don't get them with VB, PHP, Ruby, Java, C++, C, C#, PHP, Python, R, or any other language. You'll use some of these languages in the future.
  • When you extract data to a worksheet, you're changing the original workbook. Often you don't want to do that. You want to keep the file just as it came in.

Summary

Arrays are a cross between variables and worksheets. They're variables, since they're in memory, like all variables. They're like worksheets, because they can have rows and columns.

Arrays are not as flexible as worksheets. In VBA arrays, every cell must be the same data type.

Every programming language you'll use has arrays, and they work more-or-less like VBAs.