I’ve long been a bit of a fan of the Ruby language (yes, there is more to life than VBA). I’m currently reading Metaprogramming Ruby, from the excellent Pragmatic Programmers. This is all about taking a very dynamic, data-driven view of programs, customising or extending class/object capabilities at run-time. The language design owes a lot to Smalltalk, which followed the everything-is-an-object philosophy (so classes are objects, and therefore need their own classes: metaclasses).
Anyway, in the Ruby community there’s interest in Domain Specific Languages: problem-specific ‘dialects’ of Ruby that offer higher level expressions for particular domains. Which led me to wonder what a DSL would be like in Excel/VBA.
Now, Excel has a sort of DSL in its formula language. It’s specific to the domain of functional calculation and lookup, with built-in functions for various categories. Some categories, like the financial functions, are focused on a particular domain; other categories are more general.
With VBA, we can clearly add custom functions relevant to a particular domain: financial, engineering, whatever. However, a formula typically produces a single value and puts it in a single cell, so the effect is quite small scale. While we can have functions that return arrays, these can be unwieldy, and can be very dependent on worksheet layout. Also, there are manageability limits to the complexity of formulas. So it could be difficult to use formulas alone to build and manipulate complex structures or models.
For example, there are numerous problems that require a node-link graph structure: workflow, scheduling, resource tracking, etc. Although such a structure could be represented as data on a worksheet, it would be quite cumbersome to manipulate it in this form.
An alternative approach would be treat the worksheet data as instructions, which are interpreted to build an internal (VBA) object model. Other types of instruction would be commands to manipulate the model, and queries of it in various ways. This implies that we have some sort of ‘worksheet interpreter’ (in which we might guess that the Evaluate and Run procedures will come in handy).
The example I’ve been looking at so far is what you might call ‘payroll type’ processing. There is a set of entities (employees, say), each of which needs to be processed, in order to determine a final value (their monthly salary payment). There is a set (sequence, really) of rules, which need to be applied to each entity, possibly modifying the value. For example: if the employee grade > 3 and they’ve worked more than 1 year and it’s December, pay them a bonus of £100. Handling all the exceptions and modifications is a classic problem for payroll systems.
Now, it’s not impossible to do this with formulas alone, but you’d be knee-deep in IF() functions. The intermediate values would be visible (presumably in columns, with the employees in rows), leading to a humungous worksheet. Another issue is that this type of processing needs to be done at a specific date/time (so we can work out things like length-of-service). We don’t want the calculations to be dynamic; we want a nice big ‘Run’ button.
I’ll describe the solution so far in the next posting.