Archive for February, 2009

Division of Labour

One of the good things about using VBA with Excel is that you get two languages to play with: Excel formulas and VBA. We can use these in a complementary way, to effect a ‘division of labour’. Specifically, there are many built-in Excel functions, which it would be pointless (and less efficient) to re-implement in VBA.

Suppose that we need a VBA function to calculate an Order Price, as the back-end to a data-entry form:

Function Order_Price(productid As String, quantity As Long) _
                                              As Currency
        'pre: productid is in Catalogue
        'pre: quantity > 0
    ...

To calculate the price, we need to look up the Unit Price (UP) for the given productid (PID), in a Catalogue that is held on a worksheet.

Catalogue

Catalogue

There is already a formula function to do just this: VLOOKUP. The lookup value and the VLOOKUP formula go in two named cells, PID and UP respectively:

Lookup

Lookup

Now, all the VBA function needs to do is set the value of the PID cell, let the VLOOKUP do its work, and then get the value out of the UP:

    ...
    Dim unitprice As Currency

    Range("PID").Value = productid
    unitprice = Range("UP").Value

    Order_Price = unitprice * qty
                 'could apply delivery, discounts, etc
End Function

Notice that the Range function is used with names, not cell addresses. Firstly, this makes the code more readable. More importantly, it makes the code much more robust, since any structural changes to the worksheet are taken care of by automatic re-definition of the names “PID” and “UP”. Having literal cell addresses (“E2″, “F2″) in our code would be a maintenance nightmare. Anyway, you name ranges in Excel as standard practice, don’t you?

Note also that Value is not the default member of Range (there isn’t one), unlike form controls. So we can’t write just ‘Range(“UP”)’, we need the ‘.Value’.

You might wonder whether the VLOOKUP formula is guaranteed to evaluate before VBA moves on to the ‘unitprice = …’ statement. However, it always seems to work in practice, so my guess is that the worksheet calculations take place before control returns to the VBA.

Using Comments in Code (and DbC)

There is a lot of VBA code out there that is devoid of comments. Partly, this is due the lightweight development culture of VBA. You’re just ‘knocking something together quickly’, so you won’t bother with comments, maybe later. We’re all guilty of this to some extent (oh, sorry, you’re the one person who always comments rigorously…). So I think it helps to have a clear idea of what you use comments for.

First the negative cases (what not to use comments for):

  • The plonkingly obvious: paraphrases of code that should be clear enough on its own (i = i + 1     ‘increment counter)
  • Comments to describe sections of a procedure that is too long, and should be broken up into separate procedures.

Now the positive cases:

  • Module header comments: what a module is about; a list of its Public procedures (i.e. its interface)
  • Procedure pre-conditions and post-conditions: more on these below
  • On Dim statements, to expand on the local variable names (I tend to favour shortish names)
  • Working notes during development. e.g. “No error handling yet”, “No validation of user input”
  • Commenting out blocks of code (though these aren’t meaningful comments).

I think they are all fairly obvious, apart from the second one. This is based on the idea of Design by Contract (DbC), advocated by Bertrand Meyer, and implemented in the Eiffel language.

The idea of a pre-condition is that a procedure states what it assumes to be true at the time it is called. Often, the pre-condition will concern the procedure’s arguments. For example:

  • an integer is positive, or negative, or within a limit
  • a date is in the future, or the past, or within a period
  • a string is non-empty, or a certain length, or numeric
  • an object reference is valid (not ‘nothing’).

A header comment in the procedure is the place to state such pre-conditions. What this does is push the responsibility for ensuring the pre-conditions on to the caller(s) of the procedure. There is thus a clear division of labour between caller and called procedure.

For example, suppose an event-handler in a Form calls a support procedure in a general module, passing some user input (a product Id, say). For the support procedure to do its stuff, it requires that the user input is valid. Stating this in a pre-condition makes it clear that it’s the Form’s job to ensure this. In practice, the Form ensures that the input is valid, by some means: e.g. checking values in the OKbutton_click handler, or using a dropdown list populated with the set of valid values. A key point is that the pre-condition means that the called procedure does not need to check or validate the condition, which makes the code simpler.

The complement of a procedure’s pre-condition is its post-condition: what the procedure ensures is the case when it has finished execution. This is particularly useful for Functions, where the post-condition relates to the returned value. Again, it might be that the result is not zero/empty/nothing. The caller can then rely on this being the case, and does not need to check it, making its code simpler.

Here’s an example:

Function OrderPrice (productid As String, _
             quantity As Integer) As Currency
    'pre: productid is valid and product.unit_price > 0
    'pre: quantity > 0
    'post: OrderPrice > 0

    'clever stuff with discounts and delivery charges ...
End Function

Post-conditions can also refer to:

  • changes made to the document/workbook/etc, such as adding or deleting elements
  • the result of searches (what happens if the search fails? is there a default?)
  • files written to disk, or other external changes
  • and so on.

This might seem a bit of a performance, but it’s really just a mindset, in which you think about what procedures are supposed to be doing (that is, their specification), not just as arbitrary lumps of code. Ok, I think I’ll get off my hobby horse now…

Creating a Directory Structure

I recently wanted to define a fairly complex directory structure, and be able to reproduce it in different locations. It seemed natural to define the structure as an indented list in an Excel worksheet, like this:

Directory structure in a worksheet

Directory structure in a worksheet

Obviously, in a real case, these names would be meaningful.

What we want is to ‘walk’ this tree structure, building up paths like these:

/A
/A/B1
/A/B1/C1
/A/B1/C2
/A/B2
/A/B2/C3
/A/B2/C4
/A/B2/C4/D1
...

and create the corresponding directories.

The way we walk the tree is ‘depth first': keep going down-and-right while there are filled cells; when we get to an empty cell, go left until we find a filled cell:

How we walk the tree

How we walk the tree

We’ll do this by changing row and column indexes. Cell A1 is (0,0). When we get back to column 0, then we’re done.

Here’s the main procedure:

Sub TreeWalkVal()
        'Tree walker that generates chained name paths
        'e.g. /foo/bar/baz
        'The first name is that of the root node.
        'Assumes that the tree is on Worksheets(1).

    Dim r As Integer, c As Integer      'row, column indexes
    Dim home As Range
    Dim rng As Range                    'the current cell
    Dim done As Boolean
    Dim path As New List

    r = 0
    c = 0
    done = False

    Worksheets(1).Activate
    Set home = ActiveSheet.Range("A1")
    Set rng = home
    path.Add (rng.Value)            'Root

    Do Until done

        Call NodeAction(r, c, rng.Value, path)

        r = r + 1
        c = c + 1
        Set rng = home.Offset(r, c)       'walk down-right
        path.Add (rng.Value)

        Do While IsEmpty(rng) And c > 0   'walk left
            c = c - 1
            Set rng = home.Offset(r, c)
            path.Remove                   'last element of path
            If Not path.IsEmpty Then _
                      path.SetLast (rng.Value)
        Loop

        If c = 0 Then done = True

    Loop

End Sub

The path is held as a List of elements. List is a class (defined in a Class Module): I’ll talk about this in another post. With our List, we can add and remove elements to/from the tail of the List.

NodeAction is broken out into a separate procedure, because the tree-walking is quite generic, and we might want to do different actions:

Sub NodeAction(r As Integer, c As Integer, _
                    val As String, path As List)
        'Action to be performed for each node.
        'In this case a directory is created
        'for the specified path.
        'An existing directory at path throws an error,
        'which is ignored.
        'There is no directory deletion.

    On Error Resume Next

    Dim pathstring As String
    pathstring = path.ToString

    Debug.Print val; " : "; pathstring

    If Not path.IsEmpty Then _
            MkDir (ActiveWorkbook.path + pathstring)

End Sub

List defines a constant separator (in this case, “/”), which is used by the List.ToString method.

The On Error Resume Next ignores the ‘directory already exists’ error.

The TreeWalkVal procedure works with incremental additions to the structure, but does not handle deletions (an exercise left for the reader, as they say).

An alternative procedure is TreeWalkNum, which generates paths with positional numbering:

/1
/1/1
/1/2
/2
/2/1
/2/2
/2/2/1
/2/3
...

This could be useful for generating nested sections in a document (where these are numbered explicitly, not auto-numbered).

Excel Timestamp

If you want a timestamp in an Excel workbook, you can use the NOW() function (or TODAY() if you don’t want the time part). However, these functions are ‘volatile’, meaning that they recalculate themselves whenever any change is made to the workbook. This isn’t usually what we want: what we typically want is a calculate-once funcion.

If we want a single timestamp for creation of a workbook, we could have a =NOW() formula in the workbook template, and then on opening a new workbook do a Copy-Paste Special Values for that cell. In the ThisWorkbook module, we could have:

Private Sub Workbook_Open()
    Range("Timestamp").Select
    Selection.Copy
    Selection.PasteSpecial Paste:=xlValues, _
                Operation:=xlNone, SkipBlanks:=False, _
                Transpose:=False
    Application.CutCopyMode = False
End Sub

However, there’s a simpler way: define a function that can be used in a formula, which uses the VBA Now() function, rather than the Excel NOW() function. In a public module of the template or your PERSONAL.xls workbook define:

Public Function TIMESTAMP() As Date
    TIMESTAMP = Now
End Function

Now, where we want a timestamp, we can simply enter the formula =TIMESTAMP(). This is calculated when the formula is entered from edit mode, or when it is filled into a cell (e.g. filled down).

We can timestamp data values as they are entered, say in column A, by having this formula in column B:

=IF(NOT(ISBLANK(A2)),PERSONAL.XLS!TIMESTAMP(),””)

That’s an empty string as the ‘else’ argument.

A timestamp even disappears if we delete a data value from column A, and re-appears with an updated value if a data value is re-entered.

Constants in Code (Part 2)

In the previous post, we moved literal constant values into a separate module, together with ‘constructor’ functions for building values.  However, the scope of these constant and function declarations is the project: in other words a single template. So how can we share these declarations across multiple templates?

What we do is create a template (in our standard Templates directory), containing just the constant module(s): let’s call it Constants.dot. By default the template project will be called ‘Project’. We need to change this to some meaningful name, to avoid name clashes: in the Project Properties, let’s change this to ‘PConstants’.

Now comes the clever part: we can reference a template as a library, just as we reference the Word library. Within each template that uses the constants, go into >Tools >References…, browse to the Constants.dot file, and OK. Now whenever we open the client template it loads PConstants, with whatever declarations are currently in it.

Incidentally, you might notice that the PConstants project is unviewable: you can’t see its contents. You can also see that Constants.dot is loaded as a template via >Tools >Templates and Add-ins… in the Word application.

So now we have a single master copy of our constant declarations, which can be loaded by any client template that requires these. Any changes made to the master copy are propagated to the client templates.

Although we’ve described this in the context of Word, the same applies for the other applications. In the case of Excel, the constant module would live in an Add-in (.xla file), rather than a template.


February 2009
M T W T F S S
« Jan   Mar »
 1
2345678
9101112131415
16171819202122
232425262728  

Follow

Get every new post delivered to your Inbox.