Understanding ROW_NUMBER() in SQL Window Functions

Introduction

In this article, we will explore ROW_NUMBER() in detail, using examples. Window functions are a powerful feature in SQL that allow you to perform calculations across a set of rows that are related to the current row. One of the most commonly used window functions is ROW_NUMBER().

What are Window Functions?

Let's first understand what window functions are and how they work.

Window functions perform calculations across a set of table rows that are somehow related to the current row. They are similar to aggregate functions, but unlike aggregate functions, they don't cause rows to become grouped into a single output row.

Now, let's set up a sample table to understand the ROW_NUMBER() Window function.

-- Create a table of students
CREATE TABLE students (
    student_id INT PRIMARY KEY,
    name VARCHAR(50),
    class INT,
    marks INT
);

-- Insert sample data
INSERT INTO students (student_id, name, class, marks) VALUES
(1, 'Aarav Patel', 10, 85),
(2, 'Diya Sharma', 10, 92),
(3, 'Arjun Singh', 10, 78),
(4, 'Ananya Gupta', 11, 88),
(5, 'Rohan Verma', 11, 95),
(6, 'Ishaan Kumar', 11, 82),
(7, 'Lokendra Singh', 12, 91),
(8, 'Vihaan Chopra', 12, 89),
(9, 'Aisha Bose', 12, 94);

Understanding ROW_NUMBER()

ROW_NUMBER() is a window function that assigns a unique integer value to each row within the partition of a result set. The numbering starts at 1 and continues incrementally for each additional row.

Basic Syntax of ROW_NUMBER()

ROW_NUMBER() OVER (
    [PARTITION BY col1, col2, ...]
    [ORDER BY col1, col2, ...]
)

Let's break down the above syntax:

  • PARTITION BY (optional): This clause divides the result set into partitions. ROW_NUMBER() is applied to each partition separately.
  • ORDER BY (optional): This specifies the order in which the row numbers should be assigned within each partition.

1. Basic ROW_NUMBER()

SELECT 
    name,
    class,
    marks,
    ROW_NUMBER() OVER (ORDER BY marks DESC) AS rank
FROM 
    students;

/*
Output:

name            class   marks   rank
Rohan Verma     11      95      1
Aisha Bose      12      94      2
Diya Sharma     10      92      3
Lokendra Singh  12      91      4
Vihaan Chopra   12      89      5
Ananya Gupta    11      88      6
Aarav Patel     10      85      7
Ishaan Kumar    11      82      8
Arjun Singh     10      78      9

*/

In the above query, ROW_NUMBER() assigns a unique number to each row, ordered by marks in descending order.

2. ROW_NUMBER() with PARTITION BY

Now, let's rank students within their own class:

SELECT 
    name,
    class,
    marks,
    ROW_NUMBER() OVER (PARTITION BY class ORDER BY marks DESC) AS class_rank
FROM 
students;

/* Output:

name            class   marks   class_rank
Diya Sharma     10      92      1
Aarav Patel     10      85      2
Arjun Singh     10      78      3
Rohan Verma     11      95      1
Ananya Gupta    11      88      2
Ishaan Kumar    11      82      3
Aisha Bose      12      94      1
Lokendra Singh  12      91      2
Vihaan Chopra   12      89      3

*/

In the above query, the PARTITION BY clause creates separate rankings for each class. The ORDER BY clause then determines the order of row numbers within each class partition.

Uses of ROW_NUMBER()

  1. Ranking: As shown in our examples, ROW_NUMBER() is excellent for creating rankings.
  2. Removing Duplicates: You can use ROW_NUMBER() to identify and remove duplicate rows.
  3. Pagination: ROW_NUMBER() can be used to implement pagination in SQL queries.
  4. Top N per Group: You can use ROW_NUMBER() to select the top N rows from each group in a dataset.

Summary

ROW_NUMBER() is a versatile window function that can significantly enhance your SQL queries. By understanding how to use ROW_NUMBER() with PARTITION BY and ORDER BY clauses, you can solve complex ranking and analytical problems with ease. Remember, window functions like ROW_NUMBER() operate on a set of rows and return a single value for each row. This makes them incredibly powerful for analytical queries where you need to perform calculations across a set of related rows.


Similar Articles