My wife, Liz, is a Business Intelligence (i.e. mega-database reporting) consultant, and came up with a problem to which I offered to produce an Excel (non-VBA) solution. Since the problem is of quite wide applicability, I’ll describe it here. I also came across an Excel function that was new to me.

The problem is this. Suppose that you have a table of employee records, with Sex and Salary. For entirely proper reasons you want to find out the proportion of females in top n% of salaries. (The n% is of employees, not salary range). However, some employees just outside the n% will have salaries the same as the lowest-paid members of the n%, and so need to be included, otherwise we’d lose the former from the proportion calculation.

Here’s a simple example with 10 employees:

Obviously, with so few records, the percentage of top-earners we want to see has to be large – typically, we’d be looking at 1% to 10%. Here we want to see the female proportion for the top 25% of salaries. (The data is shown here sorted, for clarity, but does not need to be so).

25% takes us ‘half way down’ employee #3, who clearly needs to be included. But we also want to include employee #4, who earns the same. In this top-four, we have 1 female, so the proportion is 25%.

The formulas are as follows:

*idcount*: = ROWS(SalaryTable)

*pcount*: = CEILING(idcount * percentile,1)

*cutoffsalary*: = LARGE(SalaryTable[salary],pcount)

*mcount*: = COUNTIFS(SalaryTable[sex],”=m”,

SalaryTable[salary],”>=” & cutoffsalary)

*fcount*: = COUNTIFS(SalaryTable[sex],”=f”,

SalaryTable[salary],”>=” & cutoffsalary)

*fpercent*: = fcount / (fcount + mcount)

The COUNTIFS function was new in 2007, and is new to me. You can have up to 127 range/criterion pairs. The ranges must have the same dimensions (i.e. column length, here). The criteria are evaluated on a cell-by-cell basis (i.e. by row, here), and ANDed together.

Note also the use of the LARGE function, to get the *i*th largest salary, for *i* = *pcount*.

I was rolling up my sleeves for some array formulas, but it turned out to be quite straightforward.

### Like this:

Like Loading...

*Related*

Here what I would have done, in 2 cells

Find the cut off level XX%

Sum product X*1, Y*1

Ross: you’ll have to explain your suggestion in more detail.

The various formulas in my example could of course be rolled up into a single formula in a single cell.

Very interesting article, thanks. As I’m limited at my current workplace to using Excel2003 I’d probably do this using pivot tables, but those new functions in 2007 look very useful. If anything the LARGE function is the one that catches my eye here.