Emulating Analytic (AKA Ranking) Functions with MySQL
Pages: 1, 2, 3, 4
We must, though, be careful about one thing: the analytic function applies to the result set defined by the query in which it appears. This means that if we apply a restrictive condition, the sum as computed by the analytic function changes:
SQL> select deptno, ename, sal, sum(sal) over (partition by deptno)
2 from emp
3 where job not in ('PRESIDENT', 'MANAGER')
4 order by 1, 3
5 /
DEPTNO ENAME SAL SUM(SAL)OVER(PARTITIONBYDEPTNO)
---------- ---------- ---------- -------------------------------
10 MILLER 1300 1300
20 SMITH 800 7900
20 ADAMS 1100 7900
20 SCOTT 3000 7900
20 FORD 3000 7900
30 JAMES 950 6550
30 WARD 1250 6550
30 MARTIN 1250 6550
30 TURNER 1500 6550
30 ALLEN 1600 6550
10 rows selected.
SQL>
If we want to have the same result in our MySQL implementation, we must apply the restrictive condition on the job description twice, once when computing the aggregate, and once when returning the individual rows. So, those highly praised analytic functions are just shortcuts for joining a table to an aggregate over the same table? Big deal! Well, not quite. Things get a little more complicated when we introduce a minor twist: relating rows in the "window" defined by the over clause to each other, which is usually performed by adding an order by to the clause, and optionally a range restriction.
Adding Ordering to the OVER Clause
SUM() is not the best function to apply an order by to, because whatever the order of the rows, the sum remains the same. It is a different matter when you want to compare rows within a window to each other, for instance ranking them with regard to a particular column. Oracle allows you to use regular aggregate functions in an analytical way, but it also introduces a number of purely analytic functions, of which RANK() can be considered to be the archetype. Actually, the typical question that calls for an analytic function is along the lines of "what are the top two salaries per department?". This is a question that calls for ranking rows relative to each other (hence the name of ranking function used with SQL Server 2005).
The per department in the previous question clearly tells us that we must partition by department. We could have omitted the partition by to apply the function to the whole of the table, but then a simple order by and a limit clause would have done the trick in MySQL. What allows us to assign a particular rank is the salary, which is what we must order by.
SQL> select deptno, empno, ename, sal, rank() over (partition by deptno
2 order by sal desc) "RANK"
3 from emp;
DEPTNO EMPNO ENAME SAL RANK
---------- ---------- ---------- ---------- ----------
10 7839 KING 5000 1
10 7782 CLARK 2450 2
10 7934 MILLER 1300 3
20 7788 SCOTT 3000 1
20 7902 FORD 3000 1
20 7566 JONES 2975 3
20 7876 ADAMS 1100 4
20 7369 SMITH 800 5
30 7698 BLAKE 2850 1
30 7499 ALLEN 1600 2
30 7844 TURNER 1500 3
30 7654 MARTIN 1250 4
30 7521 WARD 1250 4
30 7900 JAMES 950 6
14 rows selected.
SQL>


