Ranking functions provide a very good feature of assigning numbering to the records in the result set in SQL. Row_Number in SQL is one of these functions available in that allows us to assign rankings or numbering to the rows of the result set data. Different values are assigned to different rows, based on the type of ranking function used. They are basically of the following types:
- Row_Number
- Rank
- Dense_Rank
- NTile
In this article, we discuss the use of the Row_Number in SQL.
To understand this concept of ranking, we need to first understand the syntax of it. The syntax is :
- Rank() Over ( {Partition_By_Clause} Order_By_Clause)
First of all, Rank() is the function to assign the numbering or ranking to the result set data. Next, the Over keyword is used along with the conditions specified in the '(' brackets (in other words the Partition and Order by clause), to decide how the ranking or numbering is started and implemented on the result set data. Here, these two keywords are also very important to be understood, before we proceed.
Order_By_Clause : This clause decides how the ranking or numbering of the data starts. For example, if we specify Order By EmployeeID desc, it will result in ordering the data in descending order by EmployeeID and then assign the rank to the records, starting from 1 to n, based on the type of ranking function we use (in other words rank, dense rank and so on).
Partition_By_Clause : This clause basically provides the grouping functionality on the result set data. When used in combination with the order by clause, it results in grouping the data and then the ranking of the rows is done, based on the type of ranking function we use (in other words rank, dense rank and so on).
So its the over keyword, along with these two clauses, that handle the ranking process of the records, when using the ranking functions.
The Row_Number function
This function works by assigning continuous ranking to the records, without skipping any number in the result set, whether it is partitioned or not. At the end of the discussion, we will see what we mean by continuous ranking and not skipping any record.
For this, we will create a table named DepartmentMembers, that stores names of the members, along with their DepartmentName. I know this is not a perfect database structure, but just a sample table with some records inserted into it, to discuss the concept. So our setup table will be like the following:
Next, we will be applying the Row_Number function with each two cases and will discuss them one by one. In the first case, we will apply only the order by clause and in the second case, we will add the partition by clause, along with the order by clause.
The main point here is that the order by clause is a must for applying the ranking functions. So let's start.
Case 1: Ranking entire result set using Order By clause
Based on the syntax above and data created in the sample table, we will apply the Row_Number function now. So our query becomes:
- SELECT ROW_NUMBER() OVER (ORDER BY DepartmentName DESC) AS GeneratedRank, MemberName, Gender, DepartmentName
- FROM dbo.DepartmentMembers
Now execute this query and see the results.
What this query has done is that, it selects the results from the table and orders them by DepartmentName in descending order. Further, it generates the rank or numbering for the records as we can see in the GeneratedRank column. So, simple numbering was generated for the records.
Case 2: Ranking partitioned result set using Partition By Clause
Next, we will add the Partition By clause to it. So the query now changes to:
- SELECT ROW_NUMBER() OVER (PARTITION BY Gender ORDER BY DepartmentName DESC) AS GeneratedRank, MemberName, Gender, DepartmentName
- FROM dbo.DepartmentMembers
Now run the query and see the results.
This time, the partition by clause grouped the results based on the Gender, ordered the data of the groups by DepartmentName in descending order and then finally applied the numbering or rank on the results, based on the ranking function used.
So in the example above, we can see that the ranking assigned to the records remains in continuous order, regardless of whether it was a complete result set or a partitioned result set.
Basically use of the order by clause only applies the numbering on the records or in other words, decides the sequence or starting point of the numbering. Further, the partition clause along with the order by, groups the records first and further applies the numbering based on the order by clause. I hope you enjoyed reading it.