Tuesday, 3 September 2019

Analytical Functions in Oracle

We use analytical functions instead writing sub queries and joins where requirement comes along with aggregation.

Analytical functions extends the power of SQL, which means we can write queries in less lines of code with analytical function when compared with Native SQL.

Lets start with Aggregate functions (MIN, MAX, AVG,COUNT,SUM) to explore analytical  functions.

Note: We are going to use EMP table which is present in SCOTT schema for examples.


SQL> select sum(sal),deptno from emp
  2  group by deptno;
  SUM(SAL) DEPTNO
---------- ------
      9400     30
      8750     10
     10875     20


Now we will write above query using Analytical function.


SQL> select deptno, sum(sal)over(partition by deptno ) as grp from emp;
DEPTNO        GRP
------ ----------
    10       8750
    10       8750
    10       8750
    20      10875
    20      10875
    20      10875
    20      10875
    20      10875
    30       9400
    30       9400
    30       9400
    30       9400
    30       9400
    30       9400
14 rows selected



as we observed both queries output, Analytic functions also operate on subsets of rows, similar to aggregate functions in GROUP BY queries, but they didn't reduce the number of rows returned by the query.


Syntax:

 Lets understand syntax here

analytic_function([ arguments ]) OVER (analytic_clause)

 Here  "analytic_function" can be any aggregate function or any other analytic function(eg: lead,lag,dense_rank etc)

 "Arguments" --> Based on requirement we pass arguments to analytic function. eg: in above query we passed "sal".


The analytic_clause breaks down into the following optional elements.

[ query_partition_clause ] [ order_by_clause [ windowing_clause ] ]


query_partition_clause

The query_partition_clause divides the result set into partitions, or groups, of data. The operation of the analytic function is restricted to the boundary imposed by these partitions, similar to the way a GROUP BY clause affects the action of an aggregate function. If the query_partition_clause is omitted, the whole result set is treated as a single partition. The following query uses an empty OVER clause, so the sum presented is based on all the rows of the result set.

 SQL> select deptno, sum(sal)over() as grp from emp;


DEPTNO        GRP
------ ----------
    10      29025
    30      29025
    10      29025
    20      29025
    20      29025
    20      29025
    20      29025
    30      29025
    30      29025
    30      29025
    30      29025
    20      29025
    30      29025
    10      29025
14 rows selected


order_by_clause:
Its optional, we can use whenever want to sort records.

Windowing_clause


 We have seen previously the query_partition_clause controls the window, or group of rows, the analytic operates on. The windowing_clause gives some analytic functions a further degree of control over this window within the current partition, or whole result set if no partitioning clause is used. The windowing_clause is an extension of the order_by_clause and as such, it can only be used if an order_by_clause is present. The windowing_clause has two basic forms.


RANGE BETWEEN start_point AND end_point
ROWS BETWEEN start_point AND end_point

When using ROWS BETWEEN, you are indicating a specific number of rows relative to the current row, either directly, or via an expression. Assuming you don't cross a partition boundary, that number of rows is fixed. In contrast, when you use RANGE BETWEEN you are referring to a range of values in a specific column relative to the value in the current row. As a result, Oracle doesn't know how many rows are included in the range until the ordered set is created.

Possible values for "start_point" and "end_point" are:

UNBOUNDED PRECEDING : The window starts at the first row of the partition, or the whole result set if no partitioning clause is used. Only available for start points.
UNBOUNDED FOLLOWING : The window ends at the last row of the partition, or the whole result set if no partitioning clause is used. Only available for end points.
CURRENT ROW : The window starts or ends at the current row. Can be used as start or end point.
value_expr PRECEDING : A physical or logical offset before the current row using a constant or expression that evaluates to a positive numerical value. When used with RANGE, it can also be an interval literal if the order_by_clause uses a DATE column.
value_expr FOLLOWING : As above, but an offset after the current row.





No comments:

Post a Comment