Introduction
A window function is a special type of function that allows you to perform calculations on a specific subset, or "window," of rows from a result set. It's like looking at a window into your data and performing calculations on that smaller portion. Window functions are different from regular aggregate functions like SUM or AVG because they calculate values for each row individually, taking into account the rows within the defined window. They can be used to perform various calculations, such as aggregations, ranking, and statistical operations, without the need for grouping the data or using subqueries.
There are several types of window functions commonly used in SQL. Here are some of the most frequently used ones:
Aggregate Window Functions
These functions perform calculations on a subset of rows within a window and return a single aggregated result.
SUM(): Calculates the sum of a column within the window.
//Calculate the total revenue for each region, including a column with the overall average revenue.
SELECT
region,
SUM(revenue) AS total_revenue,
AVG(revenue) OVER () AS overall_avg_revenue
FROM
sales
GROUP BY
region;
AVG(): Computes the average of a column within the window.
SELECT
product_id,
product_name,
category,
price,
AVG(price) OVER (PARTITION BY category) AS avg_price
FROM
products;
COUNT(): Counts the number of rows within the window.
SELECT
product_id,
product_name,
category,
price,
COUNT(*) OVER (PARTITION BY category) AS category_count
FROM
products;
MAX(): Finds the maximum value within the window.
SELECT
product_id,
product_name,
category,
price,
MAX(price) OVER (PARTITION BY category) AS max_price
FROM
products;
MIN(): Finds the minimum value within the window.
SELECT
emp_id,
emp_name,
department,
salary,
MIN(salary) OVER (PARTITION BY department) AS min_salary
FROM
employees;
Ranking Window Functions
These functions assign a rank or position to each row within a window based on a specified criterion.
ROW_NUMBER(): Assigns a unique number to each row within the window.
SELECT
ROW_NUMBER() OVER (ORDER BY department) AS row_num,
employee_id,
employee_name,
department,
salary
FROM
employee
ORDER BY
department;
RANK(): Assigns a rank to each row, with gaps in case of ties.
SELECT
RANK() OVER (ORDER BY price DESC) AS product_rank,
product_id,
product_name,
category,
price
FROM
products;
DENSE_RANK(): Assigns a rank to each row without gaps in case of ties.
SELECT
sale_id,
product_name,
category,
sale_amount,
DENSE_RANK() OVER (ORDER BY sale_amount DESC) AS dense_rank
FROM
sales;
NTILE(): Divides the rows into specified buckets or percentiles.
SELECT
student_id,
student_name,
score,
NTILE(3) OVER (ORDER BY score DESC) AS tile_number
FROM
students;
Analytic Window Functions
These functions provide additional analytical capabilities and often require both partitioning and ordering of rows.
LAG(): Retrieves the value from a previous row within the window.
SELECT
order_id,
customer_id,
order_date,
order_total,
LAG(order_total, 1, 0) OVER (PARTITION BY customer_id ORDER BY order_date) AS previous_order_total
FROM
orders;
LEAD(): Retrieves the value from a subsequent row within the window.
SELECT
employee_id,
first_name,
last_name,
department,
salary,
LEAD(salary, 1, 0) OVER (PARTITION BY department ORDER BY employee_id) AS next_salary
FROM
employees;
FIRST_VALUE(): Returns the value of a specified expression from the first row in the window.
SELECT
employee_id,
first_name,
last_name,
department,
salary,
FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY employee_id) AS first_salary
FROM
employees;
LAST_VALUE(): Returns the value of a specified expression from the last row in the window.
SELECT
employee_id,
first_name,
last_name,
department,
salary,
LAST_VALUE(salary) OVER (PARTITION BY department ORDER BY employee_id
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_salary
FROM
employees;
Here are some examples that aim to provide valuable insights and enhance your understanding of Window functions effectively
Calculate the total revenue for each region, including a column with the overall average revenue.
SELECT
region,
SUM(revenue) AS total_revenue,
AVG(revenue) OVER () AS overall_avg_revenue
FROM
sales
GROUP BY
region;
Rank products based on their sales quantities within each category.
SELECT
category,
product,
sales_quantity,
RANK() OVER (PARTITION BY category ORDER BY sales_quantity DESC) AS product_rank
FROM
sales
ORDER BY
category, product_rank;
Calculate the running total of sales revenue for each day.
SELECT
sales_date,
SUM(revenue) OVER (ORDER BY sales_date) AS running_total
FROM
daily_sales
ORDER BY
sales_date;
Determine the percentage of total sales revenue contributed by each product within its category.
SELECT
category,
product,
revenue,
revenue / SUM(revenue) OVER (PARTITION BY category) * 100 AS revenue_percentage
FROM
sales
ORDER BY
category, product;
Find the highest revenue achieved in each quarter.
SELECT
DATE_TRUNC('quarter', sales_date) AS quarter,
MAX(revenue) AS highest_revenue
FROM
sales
GROUP BY
quarter
ORDER BY
quarter;
Calculate the average rating of movies within each genre.
SELECT
genre,
AVG(rating) AS average_rating
FROM
movies
GROUP BY
genre;
Determine the difference in sales quantity between the current row and the previous row.
SELECT
order_date,
sales_quantity,
LAG(sales_quantity) OVER (ORDER BY order_date) AS previous_sales_quantity,
sales_quantity - LAG(sales_quantity) OVER (ORDER BY order_date) AS sales_quantity_difference
FROM
sales;
Rank customers based on their total purchase amounts.
SELECT
customer_id,
total_purchase_amount,
RANK() OVER (ORDER BY total_purchase_amount DESC) AS purchase_rank
FROM
(
SELECT
customer_id,
SUM(purchase_amount) AS total_purchase_amount
FROM
purchases
GROUP BY
customer_id
) AS purchase_summary;
Calculate the cumulative percentage of sales revenue for each product.
SELECT
product_id,
sales_revenue,
SUM(sales_revenue) OVER (ORDER BY sales_revenue DESC) AS cumulative_revenue,
SUM(sales_revenue) OVER (ORDER BY sales_revenue DESC) / SUM(sales_revenue) OVER () * 100 AS cumulative_percentage
FROM
sales;
Determine the average salary of employees within each department.
SELECT
department,
salary,
AVG(salary) OVER (PARTITION BY department ORDER BY department) AS average_salary
FROM
employees
ORDER BY
department;
Calculate the moving average of sales quantities over a specific period.
SELECT
Date,
SalesQuantity,
AVG(SalesQuantity) OVER (ORDER BY Date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAverage
FROM
Sales
ORDER BY
Date;
Rank students based on their exam scores within each subject.
SELECT
Subject,
StudentName,
ExamScore,
RANK() OVER (PARTITION BY Subject ORDER BY ExamScore DESC) AS SubjectRank
FROM
ExamScores;
Determine the percentage growth in revenue compared to the previous year for each quarter.
SELECT
Year,
Quarter,
Revenue,
(Revenue - LAG(Revenue) OVER (ORDER BY Year, Quarter)) / LAG(Revenue) OVER (ORDER BY Year, Quarter) * 100 AS RevenueGrowth
FROM
RevenueData;
Find the top-selling product within each category.
SELECT
Category,
Product,
SalesQuantity,
RANK() OVER (PARTITION BY Category ORDER BY SalesQuantity DESC) AS ProductRank
FROM
SalesData;
Calculate the median salary of employees within each department.
SELECT
Department,
Salary,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Salary) OVER (PARTITION BY Department) AS MedianSalary
FROM
EmployeeData;
Determine the difference in sales revenue between the current row and the next row.
SELECT
Date,
Revenue,
LEAD(Revenue) OVER (ORDER BY Date) - Revenue AS RevenueDifference
FROM
SalesData;
Rank cities based on their population density.
SELECT
City,
PopulationDensity,
RANK() OVER (ORDER BY PopulationDensity DESC) AS CityRank
FROM
CityData;
Calculate the cumulative sum of sales quantities for each product.
SELECT
Product,
SalesQuantity,
SUM(SalesQuantity) OVER (PARTITION BY Product ORDER BY Date) AS CumulativeSalesQuantity
FROM
SalesData;
Determine the percentage of customers who made a repeat purchase within each month.
SELECT
Month,
COUNT(DISTINCT CustomerID) AS TotalCustomers,
COUNT(DISTINCT CASE WHEN RepeatPurchase = 1 THEN CustomerID END) / COUNT(DISTINCT CustomerID) * 100 AS RepeatPurchasePercentage
FROM
PurchaseData
GROUP BY
Month;
Rank employees based on their performance scores within each department.
SELECT
Department,
Employee,
PerformanceScore,
RANK() OVER (PARTITION BY Department ORDER BY PerformanceScore DESC) AS EmployeeRank
FROM
EmployeeData;
Calculate the average order value for each customer, including the overall average order value.
SELECT
CustomerID,
OrderValue,
AVG(OrderValue) OVER (PARTITION BY CustomerID) AS AverageOrderValue,
AVG(OrderValue) OVER () AS OverallAverageOrderValue
FROM
OrderData;
Determine the difference in ratings between the current movie and the highest-rated movie within its genre.
SELECT
Movie,
Genre,
Rating,
MAX(Rating) OVER (PARTITION BY Genre) - Rating AS RatingDifference
FROM
MovieData;
Calculate the maximum temperature recorded in each month.
SELECT
EXTRACT(MONTH FROM Date) AS Month,
MAX(Temperature) AS MaxTemperature
FROM
WeatherData
GROUP BY
EXTRACT(MONTH FROM Date);
Determine the percentage of total sales revenue contributed by each customer.
SELECT
CustomerID,
SalesRevenue,
SalesRevenue / SUM(SalesRevenue) OVER () * 100 AS RevenuePercentage
FROM
SalesData;
Rank countries based on their GDP per capital.
SELECT
Country,
GDPperCapita,
RANK() OVER (ORDER BY GDPperCapita DESC) AS CountryRank
FROM
CountryData;
Calculate the running total of customer orders within each week.
SELECT
Week,
CustomerID,
OrderCount,
SUM(OrderCount) OVER (PARTITION BY Week ORDER BY CustomerID) AS RunningTotal
FROM
OrderData;
Calculate the average rating of movies released each year.
SELECT
EXTRACT(YEAR FROM ReleaseDate) AS Year,
AVG(Rating) AS AverageRating
FROM
MovieData
GROUP BY
EXTRACT(YEAR FROM ReleaseDate);
Determine the percentage of total revenue achieved by each salesperson.
SELECT
Salesperson,
Revenue,
Revenue / SUM(Revenue) OVER () * 100 AS RevenuePercentage
FROM
SalesData;
Calculate the cumulative product of sales quantities for each product.
SELECT
Product,
SalesQuantity,
EXP(SUM(LOG(SalesQuantity)) OVER (PARTITION BY Product ORDER BY Date)) AS CumulativeProduct
FROM
SalesData;
Determine the difference in population between the current city and the next city.
SELECT
City,
Population,
LEAD(Population) OVER (ORDER BY City) - Population AS PopulationDifference
FROM
CityData;
Determine the percentage decrease in sales revenue compared to the previous month for each month.
SELECT
Month,
SalesRevenue,
(SalesRevenue - LAG(SalesRevenue) OVER (ORDER BY Month)) / LAG(SalesRevenue) OVER (ORDER BY Month) * 100 AS RevenueDecreasePercentage
FROM
SalesData;
Find the highest-rated movie within each genre.
SELECT
Genre,
Movie,
Rating
FROM (
SELECT
Genre,
Movie,
Rating,
RANK() OVER (PARTITION BY Genre ORDER BY Rating DESC) AS MovieRank
FROM
MovieData
) AS RankedMovies
WHERE
MovieRank = 1;
Calculate the median age of employees within each department.
SELECT
Department,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Age) OVER (PARTITION BY Department) AS MedianAge
FROM
EmployeeData;
Calculate the cumulative count of unique customers for each product.
SELECT
Product,
CustomerID,
COUNT(DISTINCT CustomerID) OVER (PARTITION BY Product ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS CumulativeUniqueCustomers
FROM
SalesData;
Determine the percentage increase in sales quantity compared to the previous month for each month.
SELECT
Month,
SalesQuantity,
(SalesQuantity - LAG(SalesQuantity) OVER (ORDER BY Month)) / LAG(SalesQuantity) OVER (ORDER BY Month) * 100 AS QuantityIncreasePercentage
FROM
SalesData;
Find the top-performing employee within each department.
SELECT
Department,
Employee,
PerformanceScore
FROM (
SELECT
Department,
Employee,
PerformanceScore,
RANK() OVER (PARTITION BY Department ORDER BY PerformanceScore DESC) AS EmployeeRank
FROM
EmployeeData
) AS RankedEmployees
WHERE
EmployeeRank = 1;
Determine the difference in customer satisfaction ratings between the current row and the highest-rated row within each department.
SELECT
Department,
CustomerID,
SatisfactionRating,
SatisfactionRating - MAX(SatisfactionRating) OVER (PARTITION BY Department) AS RatingDifference
FROM
CustomerData;
These are just a few examples of the types of window functions available in SQL. The specific functions and syntax may vary depending on the database system you are using. Window functions provide a powerful toolset for performing complex calculations and analysis within SQL queries.
Thank you for reading, and I hope this post has helped provide you with a better understanding of the Window Function in SQL.