Archive for March, 2010

Table Subtotals

The built-in Subtotals facility (now on the Data tab) has never been particularly useful. Inserting subtotal rows into your data rows breaks one of the commandments of Excel: Thou shalt not sully the purity of Thy Data. Not only does the data have to be sorted prior to subtotalling, you can not then re-sort or filter the data. Furthermore, with 2007, you can not use subtotals with Tables, which as I’ve mentioned previously, are rather useful.

So what are the alternatives? One way is to use an array formula as a calculated filter. Here’s an example:

As you can see, it’s an old workbook (2005), which I’ve ‘upgraded’. To get the subtotals by month, you compare the Date column with the two boundary values, and multiply with the Value column. D2 contains the following single-cell array formula:


Obviously, Table1 should have a more meaningful name. As you might already know, we can’t use AND here, as this collapses the tests to a single Boolean value, rather than a column-sized array of Booleans. The trick is to treat the Booleans as 0/1 values and multiply them. So for row 2, we’ll have (1 * 1 * 54).

The formula then Fills Right. We could equally well have the subtotals vertically, and Fill Down.

If you don’t fancy the array formulas, here’s an alternative:

Here, I’ve added a Month column, to make the filtering simpler. B2, for example, contains:

=MONTH(Table2[[#This Row],[Date]])

Each of the subtotals is a DSUM formula, using the cells above as the criteria for the filter. E3 contains:


Note that we need the [#All] accessor, otherwise the formula does not pick up the header row. Note also that the column name is a string, not a name, and is thus in quotes.

The formula Fills Right. A limitation is that the criteria ranges have to be vertical – that is, column label above criterion – so we can’t have the subtotals arranged vertically. However, for presentation purposes, we could use an array function to transpose the range from 3R x 4C to 4R x 3C:


Although we were using the DSUM variant to avoid array formulas, TRANSPOSE is easy enough to understand.


DIY Scenarios

I’ve always thought that the built-in Scenario Manager is a bit feeble. In particular, the constituent values of a scenario should be visible on a worksheet, not squirrelled away in the SM. It’s really quite straightforward to do it yourself, especially with 2007 Tables.

Here’s a very simple example. This is a ‘Model’ worksheet, showing two different scenarios:

There are three values in our scenarios, which are dropped into D2, D3, D4. These cells are named A, B, C respectively. Output (F2) is a suitably complicated formula that uses A, B and C – obviously, there could be lots of other formulas dependent on the scenario values, making up a complex model. B1 is named Scenario.

Each scenario is in a Table on a separate worksheet, with the Tables named Scenario1, Scenario2, etc:

Back on the Model worksheet, D2, D3, D4 each contain the formula:

=INDIRECT(“Scenario” & Scenario &
“[[#This Row],[value]]”)

The [#This Row] accessor requires the formula to be on the same row as the corresponding row in the scenario tables (even though on different worksheets). If this is an issue, then D2:D4 could contain the array formula:

{=INDIRECT(“Scenario” & Scenario & “[value]”)}

since this works anywhere with respect to the scenario tables.

In fact, you don’t really need tables – you could just use named ranges on the scenario worksheets – but it saves (re)defining names manually.

The Set buttons on the scenario worksheets are just a convenience, so you can inspect a scenario and then make it the current one, without having to remember which number it is. Here’s the button code for the Scenario1 button:

Private Sub SetCommand1_Click()
    Worksheets("Model").Range("Scenario").Value = _
                          Right(Me.Name, 1)
End Sub

And similarly for the other buttons. Note that the button names have to be unique within the workbook.

Excel COUNTIFS Function

My wife, Liz, is a Business Intelligence (i.e. mega-database reporting) consultant, and came up with a problem to which I offered to produce an Excel (non-VBA) solution. Since the problem is of quite wide applicability, I’ll describe it here. I also came across an Excel function that was new to me.

The problem is this. Suppose that you have a table of employee records, with Sex and Salary. For entirely proper reasons you want to find out the proportion of females in top n% of salaries. (The n% is of employees, not salary range). However, some employees just outside the n% will have salaries the same as the lowest-paid members of the n%, and so need to be included, otherwise we’d lose the former from the proportion calculation.

Here’s a simple example with 10 employees:

Obviously, with so few records, the percentage of top-earners we want to see has to be large – typically, we’d be looking at 1% to 10%. Here we want to see the female proportion for the top 25% of salaries. (The data is shown here sorted, for clarity, but does not need to be so).

25% takes us ‘half way down’ employee #3, who clearly needs to be included. But we also want to include employee #4, who earns the same. In this top-four, we have 1 female, so the proportion is 25%.

The formulas are as follows:

idcount: = ROWS(SalaryTable)

pcount: = CEILING(idcount * percentile,1)

cutoffsalary: = LARGE(SalaryTable[salary],pcount)

mcount: = COUNTIFS(SalaryTable[sex],”=m”,
SalaryTable[salary],”>=” & cutoffsalary)

fcount: = COUNTIFS(SalaryTable[sex],”=f”,
SalaryTable[salary],”>=” & cutoffsalary)

fpercent: = fcount / (fcount + mcount)

The COUNTIFS function was new in 2007, and is new to me. You can have up to 127 range/criterion pairs. The ranges must have the same dimensions (i.e. column length, here). The criteria are evaluated on a cell-by-cell basis (i.e. by row, here), and ANDed together.

Note also the use of the LARGE function, to get the ith largest salary, for i = pcount.

I was rolling up my sleeves for some array formulas, but it turned out to be quite straightforward.

Dynamic Combo Box List

A combo box on a form can pick up the list contents from its RowSource property. This can be the name of a worksheet range – such as “products” (not ever a literal address, please).

However, this is tricky if the size of the list changes (if the list grows, specifically). With a standard defined Name, you need to keep on redefining the Name. Alternatively, you can use the OFFSET-function trick for dynamic ranges (but that is obscure and depends on things like the number of non-data cells in the column).

In 2007 we have the Structured Reference naming into Tables, such as “ProductTable[Product]”, which returns us the data contents of the product column. On a recent course, I was asked if we could use this as a Combo box RowSource property. The good news is that we can:

Any additions to the Table are picked up when the form reopens:

You could get this to work for a multi-column list (e.g. id + name, with the id as BoundColumn). For a 2-column table, you can specify the whole data area: “ProductTable[#data]”. It doesn’t matter if the #data area has more columns than the list needs: the leftmost ones are used.

Initially, I thought I’d try this with an ActiveX combo box dropped directly on to the worksheet. However, such on-sheet controls are actually OLEObjects. Rather than a RowSource property, they have a ListFillRange. An OLEObject could be things other than a List/Combo box, in which case this property would be inapplicable. (Similarly, RowSource is actually defined on the Control class).

That’s okay, but unfortunately, OLEObject.ListFillRange does not understand structured references into tables: it needs either a regular Defined Name or (don’t) a literal address. This seems like an oversight: the OLEObject‘s Parent is the Worksheet, which holds the Table as a ListObject – so it should be possible to resolve the name.

Nevertheless, it works fine for the Form combo boxes, and I’ll try to use this technique in the future.

March 2010
« Feb   Apr »