Note: this article is published on 06/19/2024.
Extend the article: SQL - Interesting Queries as a series of articles:
Introduction
SQL, the Structural Query Language, seems simple, but often with some tricks in them. This article will discuss a new issue.
The content of this article:
- Introduction
- Question
- Setup Table and Initial Data
- Answers
- Step 1 --- make a daily everage
- Step 2 --- The rolling average algorithm
- Step 3:--- Get Rolling Average by SubQueries
- Step 4:--- Get Rolling Average by a temp table
Question:
We have a table with three columns: Date, Currency, Rate, we need to get a output as
- Date
- Currency (Daily everage)
- Rolling Average of the three days including the current day and two previous days
Set up Table and initial Data
Table RateCal:
USE [Test]
GO
/****** Object: Table [dbo].[RateCal] Script Date: 7/19/2024 2:56:04 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[RateCal](
[Date] [date] NOT NULL,
[Currency] [money] NOT NULL,
[Rate] [float] NOT NULL,
CONSTRAINT [PK_RateCal] PRIMARY KEY CLUSTERED
(
[Date] ASC,
[Currency] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
The Table Structure:
Test Data:
Answers:
Step 1 --- make a daily everage
We realize the records could be more than one during a day, so the first step is to make daily everage:
We can eazily verify the result is correct, say, for 1/16, there are three values: 11.2, 11.3, 11.4, the everage is 11.3 that is correct.
Step 2 --- The rolling average algorithm
Rolling Average, also known as a moving average, is a statistical calculation used to analyze data over a sequence of consecutive periods.
For the given question, we have made the daily average, then the rolling average will be against the new created daily average that could be a subquery. To make the analysis simple, we make another data table, to simulate the rolling everage logic first and independently. To do this, make a simple table: RateCal1 as
The data is as simple as
Now we make a rolling everage:
We can eazily verify the result is correct, say, for 1/16, the rolling everage value is 14.6, that is calculated from the three records starting from 1/16 plus the two previous, that are 13.6, 14.5, and 15.7, respectively. The everage is just 14.6.
Logic analysis:
SELECT A.date, avg(B.rate)
FROM dbo.RateCal1 A
INNER Join dbo.RateCal1 as B
On B.date between dateadd(day, -2, A.Date) and A.Date
Group by A.Date
Examine the code, we made a self join for the Table RateCal1, For each row in our table, we join every row that was within the past 3 days and take the average.
Step 3:--- Get Rolling Average by SubQueries
We combine the Step 1 and Step 2, we have the code:
SELECT A.Date, AVG(B.AvgRate)
FROM
( -- subQuery
SELECT TOP 100 percent [Date]
,AVG([Currency]) as AvgCurrency
,AVG([Rate]) as AvgRate
FROM [Test].[dbo].[RateCal]
GROUP BY [Date]
ORDER BY [Date]
) A
INNER Join
( -- the same subQuery
SELECT TOP 100 percent [Date]
,AVG([Currency]) as AvgCurrency
,AVG([Rate]) as AvgRate
FROM [Test].[dbo].[RateCal]
GROUP BY [Date]
ORDER BY [Date]
) B
On B.date between dateadd(day, -2, A.Date) and A.Date
Group by A.Date
Although the code is a little bit ugly because one subQuery is repeated twice (A and B), the result is correct:
Note:
There are some details about SubQuery, we will discuss in another article.
Step 4:--- Get Rolling Average by a temp table
We combine the Step 1 and Step 2, with a temp table:
DROP TABLE IF EXISTS #Temp
SELECT
[Date]
,AVG([Currency]) as AvgCurrency
,AVG([Rate]) as AvgRate
INTO #Temp
FROM [Test].[dbo].[RateCal]
GROUP BY [Date]
ORDER BY [Date]
SELECT A.date, avg(B.AvgRate)
FROM #Temp A
INNER Join #Temp as B
On B.date between dateadd(day, -2, A.Date) and A.Date
Group by A.Date
The result is the same, but the coding is more elegent:
References: