Archive for October, 2010

Code Quality

An interesting essay on Code Quality by Hew Wolff, over at the Pragmatic Programmers, in their online magazine. The basic thesis is that ‘Good Code Tells the Truth’ (that’s the title). It resonated with me because a week or two back I was looking at an Access/VBA application (not one of mine!). The code in this application was, if I might be frank, dire:

• no indentation in a lot of procedures
• no splitting of long SQL strings over multiple lines (when I did reformat one, it ran to 8 lines)
• no useful comments at either module/form or procedure level
• form controls not renamed (one form had buttons called Command27, Command28, Command66, Command139 – see what I mean?)
• obscure variable names – for example, in one procedure, U3, D3, d4, SN3, FN3
• using 0 and -1 for False and True (why?)
• copy-paste-and-tweak repetition (e.g. to re-initialize a form).

The piece de resistance was a SELECT-INTO query that created a column using this expression:

[id]-[id]-1

So, arithmetic on Ids? As far as I can tell, this converts an Id value into -1 (presumably treated as True), but leaves Null values as Null, since arithmetic expressions involving Nulls equal Null. Is this a common Access programming technique? Even if it is, couldn’t it be wrapped in a function called something like NonNullToTrue (although this could not have argument or result types more specific than Variant).

This might (or might not) be a clever trick, but not considering the possibility of other people finding it confusing indicates of a poor approach to programming. What the essay mentioned earlier is saying, and what I’m agreeing with here, is that software development is as much about communication of a solution as it is about that solution just working.

Pivot Table Calculated Column – 2

Thanks to Dick Kusleika and Andy Pope for offering solutions to my pivot table problem: how to have a column that shows the difference between Max and Min aggregations (or other ‘custom’ aggregation).

Dick’s solution is to have an additional column in the source data: effectively calculating Max, Min and MaxMinDiff before aggregation in the pivot table (so the Max and Min are not done in the pivot table).

I have actually added three columns, to break the formula down: the Max and Min columns are just intermediates. The Max and Min values are within Month: since the records are sorted by Month, this can be seen easily.

Here’s the pivot table:

In the pivot table, the Max and Min columns are normal aggregations of the Sales column in the data: they are not derived from the Max and Min columns in the data.

However, the MaxMinDiff column in the pivot table is derived from the Diff column in the data. Since all the values underlying a MaxMinDiff cell have the same value (e.g. Data H2:H6 underlies Pivot Table D3), we can use any of the non-summing aggregation functions: Max, Min, even Average. I changed the lable to MaxMinDiff, because “Max of Diff” looks weird.

Now the formulas in the additional data columns. The Max and Min formulas are single-cell array formulas:

{=MAX([Sales] * ([Month] = SalesTable[[#This Row],[Month]]))}

{=MIN(IF([Month] = SalesTable[[#This Row],[Month]],[Sales],””))}

The MAX formula is multiplying the Sales column by an array of Booleans, where the latter come from testing the Month column against this row’s Month. I do like the table-range addressing, although I don’t see why a formula within SalesTable needs to refer to SalesTable. The resulting array contains Sales figures for a specific month, and zeroes for the other months; this is then MAXed.

The MIN formula is slightly trickier, because we can not leave the false-case zeroes, as these would of course be the minimum value. However, MIN of a number and an empty value is the number, so “” is substituted for false/zero.

The Diff column formula is then just a simple difference:

=SalesTable[[#This Row],[Max]]-SalesTable[[#This Row],[Min]]

(Again, I don’t see why the syntax requires the reference to SalesTable).

This all works fine, provided that Month is the only visible dimension in our pivot table. This is because the data Max and Min values are within Month, as is obvious from the formulas. However, we could equally well want to see MaxMinDiff for values within Region or within Sales Rep. This would necessitate further columns, or changing the formulas to test a different dimension (e.g. Region).

Andy Pope suggested a VBA solution, handling the Worksheet_PivotTableUpdate event:

```Private Sub Worksheet_PivotTableUpdate(ByVal Target As PivotTable)

With Target.DataBodyRange
With .Columns(.Columns.Count)
.Offset(-1, 1).Resize(1, 1) = "MaxMinDiff"
.Offset(, 1).FormulaR1C1 = "=RC[-2]-RC[-1]"
End With
End With
End Sub
```

This certainly inserts a column of formulas, but it is next to, rather than part of the pivot table. Consequently, it does not adapt to changes in the pivot table layout. Also, it requires the Max and Min columns to be in the pivot table (I guess you could live with that).

On a completely different tack, a PivotCache can be based on a Recordset (typically sourced by a SELECT query on some data). I wonder if it’s possible to manipulate such a Recordset to add columns…