Interesting Query (3) --- Get Rolling Average

Note: this article is published on 06/19/2024.

Extend the article: SQL - Interesting Queries as a series of articles:

Introduction

SQL, the Structural Query Language, seems simple, but often with some tricks in them. This article will discuss a new issue. 

The content of this article:

  • Introduction
  • Question
  • Setup Table and Initial Data
  • Answers
    • Step 1 --- make a daily everage
    • Step 2 --- The rolling average algorithm
    • Step 3:--- Get Rolling Average by SubQueries
    • Step 4:--- Get Rolling Average by a temp table 

Question:

We have a table with three columns: Date, Currency, Rate, we need to get a output as

  • Date
  • Currency (Daily everage)
  • Rolling Average of the three days including the current day and two previous days

Set up Table and initial Data

Table RateCal:

USE [Test]
GO

/****** Object:  Table [dbo].[RateCal]    Script Date: 7/19/2024 2:56:04 PM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[RateCal](
	[Date] [date] NOT NULL,
	[Currency] [money] NOT NULL,
	[Rate] [float] NOT NULL,
 CONSTRAINT [PK_RateCal] PRIMARY KEY CLUSTERED 
(
	[Date] ASC,
	[Currency] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO

The Table Structure:

Test Data:

Answers:

Step 1 --- make a daily everage

We realize the records could be more than one during a day, so the first step is to make daily everage:

We can eazily verify the result is correct, say, for 1/16, there are three values: 11.2, 11.3, 11.4, the everage is 11.3 that is correct.

Step 2 --- The rolling average algorithm

Rolling Average, also known as a moving average, is a statistical calculation used to analyze data over a sequence of consecutive periods.

For the given question, we have made the daily average, then the rolling average will be against the new created daily average that could be a subquery. To make the analysis simple, we make another data table, to simulate the rolling everage logic first and independently. To do this, make a simple table: RateCal1 as

The data is as simple as

Now we make a rolling everage:

We can eazily verify the result is correct, say, for 1/16, the rolling everage value is 14.6, that is calculated from the three records starting from 1/16 plus the two previous, that are 13.6, 14.5, and 15.7, respectively. The everage is just 14.6.

Logic analysis:

SELECT A.date, avg(B.rate)
  FROM dbo.RateCal1 A
  INNER Join dbo.RateCal1 as B
  On B.date between dateadd(day, -2, A.Date) and A.Date
  Group by A.Date

Examine the code, we made a self join for the Table RateCal1, For each row in our table, we join every row that was within the past 3 days and take the average.

Step 3:--- Get Rolling Average by SubQueries

We combine the Step 1 and Step 2, we have the code:

  SELECT A.Date, AVG(B.AvgRate)
  FROM
  ( -- subQuery
	  SELECT TOP 100 percent [Date]
		,AVG([Currency]) as AvgCurrency
		,AVG([Rate]) as AvgRate
	  FROM [Test].[dbo].[RateCal]
	  GROUP BY [Date]
	  ORDER BY [Date]
  ) A
  INNER Join 
  ( -- the same subQuery
	  SELECT TOP 100 percent [Date]
		,AVG([Currency]) as AvgCurrency
		,AVG([Rate]) as AvgRate
	  FROM [Test].[dbo].[RateCal]
	  GROUP BY [Date]
	  ORDER BY [Date]
  ) B
  On B.date between dateadd(day, -2, A.Date) and A.Date
  Group by A.Date

Although the code is a little bit ugly because one subQuery is repeated twice (A and B), the result is correct:

Note:

There are some details about SubQuery, we will discuss in another article.

Step 4:--- Get Rolling Average by a temp table

We combine the Step 1 and Step 2, with a temp table:

DROP TABLE IF EXISTS #Temp

SELECT
	[Date]
	,AVG([Currency]) as AvgCurrency
	,AVG([Rate]) as AvgRate
INTO #Temp
	FROM [Test].[dbo].[RateCal]
	GROUP BY [Date]
	ORDER BY [Date]

SELECT A.date, avg(B.AvgRate)
  FROM #Temp A
  INNER Join #Temp as B
  On B.date between dateadd(day, -2, A.Date) and A.Date
  Group by A.Date

The result is the same, but the coding is more elegent:

 

References:


Similar Articles