Pages

Search This Blog

Thursday, 25 August 2011

Finding Nth largest salary in a table

Let’s say this is your data:
Name Salary
KING 5000
FORD 3000
SCOTT 3000
JONES 2975
BLAKE 2850
CLARK 2850
ALLEN 1600

Who is second – FORD or SCOTT or both?

What will you say about JONES’s salary – is it the 3rd highest salary, or the 4th highest?

If you are looking for the set of people earning the Nth highest salary, with no gaps in case of ties, then JONES should be ranked 3rd, after KING [5000, 1st], followed by FORD and SCOTT [both 3000, 2nd].

If you are looking for exact ranks with gaps if there are ties, then JONES is the 4th highest paid employee, as there are 3 people earning more than him – KING, FORD and SCOTT. In this system of ranking, FORD and SCOTT are 2nd jointly and no employee is 3rd.

This is how your ranks will look, in the 2 cases:
Name Salary Rank
KING 5000 1
FORD 3000 2
SCOTT 3000 2
JONES 2975 3
BLAKE 2850 4
CLARK 2850 4
ALLEN 1600 5
Name Salary Rank
KING 5000 1
FORD 3000 2
SCOTT 3000 2
JONES 2975 4
BLAKE 2850 5
CLARK 2850 5
ALLEN 1600 7
        Scenario 1: No gaps in case of ties                                                             Scenario 2: Gaps in case of ties

Once you have your question sorted out -

(a) Set of people earning the Nth highest salary, with continuous ranks if there are ties, OR

(b) Set of people earning the Nth highest salary, with skipped rank numbers if there are ties
Then you can proceed to writing the queries.

Scenario 1: DENSE_RANK () for Nth highest row, no gaps in case of ties

The analytic function dense_rank() will rank the rows with no gaps in ranking sequence if there are ties.

The ranks are calculated as:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
SQL> select ename
2        ,sal
3        ,dense_rank() over (order by sal desc) ranking
from   emp;
 
ENAME             SAL    RANKING
---------- ---------- ----------
KING             5000          1
FORD             3000          2
SCOTT            3000          2
JONES            2975          3
CLARK            2850          4
BLAKE            2850          4
ALLEN            1600          5
Wrap a filter around and pick out the Nth highest salary, say the 4th highest salary.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
SQL> select *
from
3  (
4    select ename
5          ,sal
6          ,dense_rank() over (order by sal desc) ranking
7    from   emp
8  )
where ranking = 4 -- Replace 4 with any value of N
10  /
 
ENAME             SAL    RANKING
---------- ---------- ----------
BLAKE            2850          4
CLARK            2850          4

The 4th position has a tie between BLAKE and CLARK.

1 comment: