The CUME_DIST () are analytic functions in SQL Server 2012
The purpose of CUME_DIST () function is to calculate cumulative distribution value in a group of the values in a given result set or a partition.
The CUME_DIST () formula
“The values less than or equal to the current value row number” / “Total number of row in a group or partition”. The return type is float (53) and the values are always between 0 and 1.
The syntax of CUME_DIST () is given below.
- CUME_DIST ( )
- OVER ([ partition_by_clause ] order_by_clause )
Example 1 CUME_DIST ()
- CREATE TABLE [CompanyShareRates]
- (
- [Id] INT IDENTITY,
- [Date] DATETIME,
- [Name] VARCHAR(100),
- [ShareRate] NUMERIC(18,2)
- )
-
- INSERT INTO [CompanyShareRates] VALUES('2017-02-03','TCS',200.00)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-04','TCS',210.00)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-05','TCS',230.00)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-06','TCS',230.00)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-07','TCS',250.50)
-
-
- SELECT
- ROW_NUMBER() OVER (ORDER BY [Name],[ShareRate]) AS [Row Number]
- ,[Date]
- ,[NAME]
- ,[ShareRate]
- ,CUME_DIST () OVER (ORDER BY [Name],[ShareRate]) AS [CUME_DIST]
- FROM [CompanyShareRates]
Explanation
CUME_DIST is calculated in Excel.
In ‘TCS’ group given above, we have 4 share rates in 5 rows.
Column E
For the 1st row, there is 1 row with value (200) or lower. Hence, the ‘E’ Column value will be 1.
For the 2nd row, there are 2 rows with value (210) or lower. Hence, the ‘E’ Column value will be 2.
For the 3rd and 4th rows, there are 4 rows with value (230) or lower. Hence, the ‘E’ Column value will be 4 for both the row.
For the 5th row, there are 5 rows with the value (250.5) or lower. Hence, the ‘E’ Column value will be 5.
Column F
Total number of rows in result set. i.e. -- 5
Column G
- CMUE_DIST for row 1 = 1/5 = 0.2
- CMUE_DIST for row 2 = 2/5 = 0.4
- CMUE_DIST for row 3 = 4/5 = 0.8
- CMUE_DIST for row 4 = 4/5 = 0.8
- CMUE_DIST for row 5 = 5/5 = 1
Example 2 CUME_DIST () with Partition by Clause
- INSERT INTO [CompanyShareRates] VALUES('2017-02-03','Hero Motocorp',390.50)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-03','GAIL',302.00)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-04','Hero Motocorp',391.00)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-04','GAIL',302.25)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-05','Hero Motocorp',390.00)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-05','GAIL',203.25)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-06','Hero Motocorp',389.50)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-06','GAIL',401.00)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-07','Hero Motocorp',389.00)
- INSERT INTO [CompanyShareRates] VALUES('2017-02-07','GAIL',401.00)
-
-
- SELECT
- ROW_NUMBER() OVER (Partition by Name ORDER BY [Name],[ShareRate]) AS [Row Number]
- ,[Date]
- ,[NAME]
- ,[ShareRate]
- ,CUME_DIST () OVER (Partition by Name ORDER BY [Name],[ShareRate]) AS [CUME_DIST]
- FROM [CompanyShareRates]