In Part 1, I was toying with the idea of worksheets that are hierarchically related. In this example, the hierarchy reflects geographical areas: Country, Area, City. Data is held by the leaf-node worksheets, and aggregated by the higher-level worksheets. Using the INDIRECT function means that the aggregation formulas are generic, and so can be used as-is from a template worksheet.
This is clearly pivot table territory, so what’s the difference with this approach?
Firstly, the scenario I have in mind is where you are collating worksheets sent to you in separate workbooks (for example, the London office sends you London.xlsx, containing a single London worksheet). You have already developed a bit of VBA to copy each of the worksheets into a single collated workbook (UK.xlsx, say). However, the incoming worksheets are effectively reports, and might contain information in addition to the raw Sales and Costs figures. So I can’t immediately collate the Sales/Costs information into a single table: if I need to, that’s a subsequent step (see later).
Secondly, each leaf-node worksheet only needs to know about its parent node. For example, London knows it’s in the South area, but not about its place in the overall hierarchy. Similarly, a non-leaf node knows only about its parent and its children. This makes it easy to change the hierarchy – for example, breaking London into Central London and Outer London.
In contrast, if the basic data were in table form, the levels of the hierarchy would need to be explicit, as columns:
If I decided to break London into Central and Outer zones, then I’d need an additional ‘Zone’ column, with values (a lot ‘n/a’) for each record.
However, the tabular data is clearly useful for generating pivot table summaries – here’s one at the Area level:
or with multiple fields on the Row axis, to get a hierarchical summary:
So, I’m thinking that it would be useful have some VBA to do the second stage of collation, of the leaf-node Sales/Costs data into a single table. This sounds like it will involve a tree walk around the worksheet hierarchy. Perhaps in two passes: once to find out the maximum depth (and thus the columns of the table), and again to copy the data into the table.