Let's look at a practical use of arrays. Lots of software can save data in Excel format, including this Web site. Instructors can download data on student work, in Excel format. Accounting software, inventory programs, HR software… many can save data in Excel format.
Here's Diana.
Diana and you work for the same company. Diana is in HR, you are in IS. Part of Diana's job is to keep the employee database up-to-date. Add new employees, update benefits plans, like that. She also analyses that data for her boss.
HR is trying out WOMBAT, a new data analysis program that's still in beta (that is, still being tested). Diana wants to export data from the HR database into WOMBAT. The HR DB can save data in Excel, and WOMBAT is supposed to read Excel files. Here's Diana's plan:
Diana starts the HR DB program, goes to the export screen, and chooses the fields she wants:
She choose EID (eployee id), last name, status (full-time or part-time), and health insurance plan (a code for the plan, e.g., "BC/P"). SHe clicks the Export to Excel button, and saves the workbook created by the HR DB. Diana opens it to make sure the data is there:
Hooray! So far, so good.
Diana starts WOMBAT, and tries to import the workbook:
You – to the rescue!
Your boss asks you to help. You contact the WOMBAT people, and they tell you that, yes, it won't import Excel. But it will import CVS (comma-separated values) files. Some CSV data:
1000, 35
1001, 20
1002, 38
1003, 32
1004, 32
1005, 32
1006, 22
It's just values, separated by commas. A simple, widely used format.
"No problem," you say to Diana. "We'll just save the data in CSV. Excel can do that."
"Great!" she says. "Just one thing…"
(Uh oh.)
"Could you get rid of the records for part-timer employees with the NH1 plan? The HR DB can't filter them out of the export. And WOMBAT can't remove them, either."
(Sigh.)
"OK," you reply. "I'll write some VBA."
So you do.
Excel to array to CSV file
How do you want to thunk about this?
I'm thinking maybe two chunks:
- Read the data into an array, filtering out the records we don't want.
- Write the array to a file.
What do you think?
We've only read data for one column. There are four!
Let's get started with the main program. I'll add a Run button… there. Now I'll just start with this:
Private Sub cmdRun_Click()
'Read data from worksheet into arrays.
'Write CSV data file from arrays.
End Sub
OK, that's a start.
W00t for Lily! She's starting with the big chunks (this is often called top-down programming). She's just putting in the comments, too. A Good Thing.
Private Sub cmdRun_Click()
'Read data from worksheet into arrays.
readFilterData eids, lastNames, employmentStatuses, plans
'Write CSV data file from arrays.
End Sub
How does that look?
So, I guess that means four counts, one for each column?
Private Sub cmdRun_Click()
'Read data from worksheet into arrays.
readFilterData eids, lastNames, employmentStatuses, plans, >>numRows<<
'Write CSV data file from arrays.
End Sub
That look's good. But let's declare the variables, and add the second call to the main program.
(Typity, typity)
Private Sub cmdRun_Click()
Dim eids(500) As String
Dim lastNames(500) As String
Dim employmentStatuses(500) As String
Dim plans(500) As String
'Read data from worksheet into arrays.
readFilterData eids, lastNames, employmentStatuses, plans, numRows
'Write CSV data file from arrays.
writeCsvFile eids, lastNames, employmentStatuses, plans, numRows
End Sub
I'm guessing at the 500. We can always change it, if it isn't enough.
How's that look?
eids
an array of strings? They're numbers. Long
rather than Integer
. They need to be Long
, because they're more than 32,767. How about adding stubs and comments for the Subs? Help keep us on track.
Private Sub cmdRun_Click()
Dim eids(500) As String
Dim lastNames(500) As String
Dim employmentStatuses(500) As String
Dim plans(500) As String
'Read data from worksheet into arrays.
readFilterData eids, lastNames, employmentStatuses, plans, numRows
'Write CSV data file from arrays.
writeCsvFile eids, lastNames, employmentStatuses, plans, numRows
End Sub
'Read data from the worksheet into arrays, filtering as we go.
'Params:
' eids: array of employee ids (output).
' lastNames: array of last names (output).
' employmentStatuses: array of full-time/part-time statuses (output).
' plans: array of health plan codes (output).
' numRows: number of rows read (output).
Sub readFilterData(eids, lastNames, employmentStatuses, plans, numRows As Integer)
End Sub
'Write data to CSV file.
'Params:
' eids: array of employee ids (input).
' lastNames: array of last names (input).
' employmentStatuses: array of full-time/part-time statuses (input).
' plans: array of health plan codes (input).
' numRows: number of rows read (input).
Sub writeCsvFile(eids, lastNames, employmentStatuses, plans, numRows As Integer)
End Sub
Well, how's that?
Now they have the complete top level of the program. Huzzah!
Sub readFilterData(eids, lastNames, ... numRows As Integer)
Tara, how do you do it?
Sub readFilterData(eids() As String, lastNames() As String, ... numRows As Integer)
The empty parens tell VBA that you're passing arrays.
Either way is fine.
The input sub
'Read data from the worksheet into arrays, filtering as we go.
'Params:
' eids: array of employee ids (output).
' lastNames: array of last names (output).
' employmentStatuses: array of full-time/part-time statuses (output).
' plans: array of health plan codes (output).
' numRows: number of rows read (output).
Sub readFilterData(eids() As String, lastNames() As String, _
employmentStatuses() As String, plans() As String, numRows As Integer)
End Sub
Now what? Hmm… you know, we just saw how to read data from a worksheet into an array, in the last lesson. I'm going to grab some code from there, and paste it here. Then adjust it.
'Read data from the worksheet into arrays, filtering as we go.
'Params:
' eids: array of employee ids (output).
' lastNames: array of last names (output).
' employmentStatuses: array of full-time/part-time statuses (output).
' plans: array of health plan codes (output).
' numRows: number of rows read (output).
Sub readFilterData(eids() As String, lastNames() As String, _
employmentStatuses() As String, plans() As String, numRows As Integer)
Dim rowSource As Integer
Dim rowDestination As Integer
rowSource = 0
rowDestination = 0
Do While Cells(rowSource + 1, 1) <> ""
'Is the data above 250?
If Cells(rowSource + 1, 1) > 250 Then
'Copy value to destination.
rowDestination = rowDestination + 1
sales(rowDestination) = Cells(rowSource + 1, 1)
End If
rowSource = rowSource + 1
Loop
End Sub
Lil, do you remember how this works?
rowSource
is a counter, tracking which row in the data source (the worksheet) we're getting data from.
rowDestination
is a counter as well, but it tracks where we put data in the destination (the array).
The While
loops until it finds an empty cell. If the data in a cell passes the filter (the If
statement), then it gets put into the array.
This should be easy to change. How about this:
'Read data from the worksheet into arrays, filtering as we go.
'Params:
' eids: array of employee ids (output).
' lastNames: array of last names (output).
' employmentStatuses: array of full-time/part-time statuses (output).
' plans: array of health plan codes (output).
' numRows: number of rows read (output).
Sub readFilterData(eids() As String, lastNames() As String, _
employmentStatuses() As String, plans() As String, numRows As Integer)
Dim rowSource As Integer
Dim rowDestination As Integer
Dim eid As String
Dim lastName As String
Dim employmentStatus As String
Dim plan As String
'Init row counters.
rowSource = 0
rowDestination = 0
Do While Cells(rowSource + 1, 1) <> ""
rowSource = rowSource + 1
'Get a row's data.
eid = Cells(rowSource, 1)
lastName = Cells(rowSource, 2)
employmentStatus = Cells(rowSource, 3)
plan = Cells(rowSource, 4)
'Save this one?
If Not (employmentStatus = "p" And plan = "NH1") Then
'Copy value to destination.
rowDestination = rowDestination + 1
eids(rowDestination) = eid
lastNames(rowDestination) = lastName
employmentStatuses(rowDestination) = employmentStatus
plans(rowDestination) = plan
End If
Loop
numRows = rowDestination
End Sub
I added variables like eid
and lastName
to put the data in each row into. Made it easier to think about what the code does. Changed the filter test. Used all four arrays. Not much else changed.
This is what programmers do. They take something that works, and adapt it.
The output sub
For
loop. Something like this:
'Write data to CSV file.
'Params:
' eids: array of employee ids (input).
' lastNames: array of last names (input).
' employmentStatuses: array of full-time/part-time statuses (input).
' plans: array of health plan codes (input).
' numRows: number of rows read (input).
Sub writeCsvFile(eids() As String, lastNames() As String, _
employmentStatuses() As String, plans() As String, numRows As Integer)
Dim index As Integer
Open ThisWorkbook.Path & "\hr-db.csv" For Output As #1
For index = 1 To numRows
>>Write out eids(index), lastNames(index), employmentStatuses(index), plans(index)<<<
Next index
Close #1
End Sub
But how do you output data in the CSV format, with all those commas? Seems like there should be an easier way.
Tara, is there?
Write
statement. That's what it does.
Oh, man! That makes it so easy!
'Write data to CSV file.
'Params:
' eids: array of employee ids (input).
' lastNames: array of last names (input).
' employmentStatuses: array of full-time/part-time statuses (input).
' plans: array of health plan codes (input).
' numRows: number of rows read (input).
Sub writeCsvFile(eids() As String, lastNames() As String, _
employmentStatuses() As String, plans() As String, numRows As Integer)
Dim index As Integer
Open ThisWorkbook.Path & "\hr-db.csv" For Output As #1
For index = 1 To numRows
>>Write #1, eids(index), lastNames(index), employmentStatuses(index), plans(index)<<
Next index
Close #1
End Sub
Write
rocks CSV files!