In this post, I am going to demonstrate how to calculate a moving average in SQL.

You also need to know a bout SQL Window Functions. If you don't, you should read my article here

To illustrate how to calculate a moving average, I'll use the DVDrental database that can be downloaded here

We have a payment table storing all the payments made by our customers.

``````# \d+ payment
Table "public.payment"
Column    |            Type             |
--------------+-----------------------------+
payment_id   | integer                     |
customer_id  | smallint                    |
staff_id     | smallint                    |
rental_id    | integer                     |
amount       | numeric(5,2)                |
payment_date | timestamp without time zone |
``````

I want to calculate the moving average of the amount over a 5 days period.

First, I calculate the total amount per day in a Common Table Expression (CTE) using a `WITH` statement.

``````WITH daily_amount AS (
SELECT
payment_date::date AS payment_date
, SUM(amount) AS amount
FROM
payment
GROUP BY
payment_date::date
)
``````

Next, I calculate the moving average on the amount using a window function:

``````SELECT
payment_date
, amount
, AVG(amount) OVER(ROWS 5 PRECEDING) AS MA5
FROM
daily_amount
;
payment_date | amount  |          ma5
--------------+---------+-----------------------
2007-02-14   |  116.73 |  116.7300000000000000
2007-02-19   | 1290.90 |  703.8150000000000000
2007-02-20   | 1219.09 |  875.5733333333333333
2007-03-19   | 2617.69 | 1311.1025000000000000
2007-04-26   |  347.21 | 1118.3240000000000000
2007-04-08   | 2227.84 | 1303.2433333333333333
2007-02-15   | 1188.92 | 1481.9416666666666667
2007-04-28   | 2622.73 | 1703.9133333333333333
``````

SQL returns an average for the first four rows. When we calculate a moving average, the first rows should be empty because we don't have enough values to calculate the average. In my example, I need to return Null for the first four rows. To do so, I'll use a `CASE WHEN ELSE` statement. This statement is the SQL equivalent of `if then else`. I also use the `ROW_NUMBER()` window function to get the number of each row of the result set. Then if `ROW_NUMBER()` is more than 4 then the query should return the average of the amount of the last 5 rows. If the `ROW_NUMBER()` is less than 4, the query should return `Null`. Finaly, I round the average to two decimals using the `ROUND()` function. This is how I write the `CASE` statement:

``````CASE
WHEN ROW_NUMBER() OVER() > 4 THEN ROUND(AVG(amount) OVER(ROWS 5 PRECEDING), 2)
ELSE Null
END AS MA5
``````

The Final query looks like this:

``````  WITH cte_daily_amount AS (
SELECT
payment_date::date AS payment_date
, SUM(amount) AS amount
FROM
payment
GROUP BY
payment_date::date
)
SELECT
payment_date
, amount
, CASE
WHEN ROW_NUMBER() OVER() > 4 THEN ROUND(AVG(amount) OVER(ROWS 5 PRECEDING), 2)
ELSE Null
END AS MA5
FROM
cte_daily_amount
;
payment_date | amount  |   ma5
---------------+---------+---------
2007-02-14    |  116.73 |
2007-02-19    | 1290.90 |
2007-02-20    | 1219.09 |
2007-03-19    | 2617.69 |
2007-04-26    |  347.21 | 1118.32
2007-04-08    | 2227.84 | 1303.24
2007-02-15    | 1188.92 | 1481.94
2007-04-28    | 2622.73 | 1703.91
[...]
``````