In Part 2, we had some VBA code that exported data from financial-report workbooks into XML files. What we can do now is query these files to produce summary tables – rather like views of a pivot table – and include these in a PDF or XHTML report, to provide an ‘executive summary’. (Note: we’re leaving VBA behind here, but stay tuned).
Here’s what our exported XML data looks like:
<?xml version="1.0" standalone="yes"?>
<bureport xmlns:xsi='http://www.w3.org/...>
<buname>Middleware</buname>
<period>4Q2009</period>
<product>
<productname>Connectomatic</productname>
<productsales>456</productsales>
<productcosts>65</productcosts>
</product>
<product>
<productname>Netotron</productname>
<productsales>43</productsales>
<productcosts>34</productcosts>
</product>
<totals>
<totalsales>499</totalsales>
<totalcosts>99</totalcosts>
</totals>
</bureport>
And here’s the kind of report we might generate (this is XHTML):
The idea is that this is a presentable/printable report. We could have a lot more in terms of sections, supporting text, fancy layout, etc.
So how do we get from the XML data to the summary report? Here’s one possibility:
- import the XML files into an XML database
- run queries over the database to generate summary tables as individual DITA topics
- include (content reference) the summary tables from our master-report topic, which has the overall structure and additional content
- generate the output report in the desired format(s).
Sounds a bit of a performance – why don’t we just collate all our data in a ‘database’ Excel workbook, knock up a few pivot tables, and copy-paste these into a Word document? Basically, flexibility. Firstly, a pivot table isn’t a general query mechanism: we might want all sorts of summary information, not just dimension-aggregations. Secondly, the individual summary tables might be included in different combinations in different reports. Also, we want the reports to reference fixed locations, where they can pick up the latest (e.g. last month’s) summary tables automatically.
For the XML database, I’m using XMLMind’s QizX Free Engine. You can use this either from a command-line or via a simple client app. There’s an XQuery query to generate each table (they’re very similar): here‘s the one for the totals by Business Unit (BU).
The summary tables are minimal DITA topics containing a <simpletable>:
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE topic ...>
<topic id="FR_BU_Totals_Data">
<title>BU Totals Data</title>
<body>
<simpletable id="BU_Totals_Data">
<sthead>
<stentry>BU</stentry>
<stentry>Sales</stentry>
<stentry>Costs</stentry>
</sthead>
<strow>
<stentry>Applications</stentry>
<stentry>4707</stentry>
<stentry>2325</stentry>
</strow>
<strow>
<stentry>Middleware</stentry>
<stentry>1103</stentry>
<stentry>421</stentry>
</strow>
</simpletable>
</body>
</topic>
The “FR” prefix on the topic id stands for “Financial Report”. Note the id also on the <simpletable>.
Now, in the DITA topic for our master report, we just have a <simpletable> element that con(tent)ref(erence)s the summary table in its separate topic:
<simpletable conref="..."/>
where the “…” is a reference of the form: filepath#topicid/elementid.
Now we just generate the report output in our chosen format. Multiple reports can be built into a larger structure: a set of XHTML pages with appropriate linking, or a multi-section PDF document, both with tables of contents, etc.
Stages #1, #2 and #4 above can all be done by command-line tools, so it should be possible to automate the entire process (including the VBA bit).
I think that deserves a cup of tea…


