SQL - Last 12 months to include empty months?

In an application I want to do a comparative ‘last 12 months’, and I have the following SQL to try and achieve that in MySQL. This works where all months have entries recorded, but I also want to include the ‘empty months’. Anyone have some suggestions on how to ‘capture/include’ the empty months.

I guess I could do it all programatically by brute force, i.e. work out 12 months ago, and step back through to present day, and do a month at a time, but wondering if it can all be done is SQL for speed and tidyness with the SQL working out what month was 12 months ago and working forward.

The sql currently looks like this

SELECT DATE_FORMAT(Entered, '%b') AS MONTH, DATE_FORMAT(Entered, '%Y') AS YEAR, SUM(Time_Minutes) AS total FROM rep_copy WHERE rep_copy.Entered <= NOW() AND rep_copy.Own_Time = '1' AND rep_copy.Entered >= Date_add(Now(), INTERVAL - 11 MONTH) AND rep_copy.Code = '52' GROUP BY DATE_FORMAT(Entered, '%Y-%m')

This returns the following (months with records in).

"Jul"	"2017"	"260"
"Aug"	"2017"	"855"
"Jan"	"2018"	"175"
"Feb"	"2018"	"135"
"Mar"	"2018"	"335"
"Apr"	"2018"	"170"
"May"	"2018"	"180"
"Jun"	"2018"	"30"

What I would actually like is

"Jul"	"2017"	"260"
"Aug"	"2017"	"855"
"Sep"	"2018"	"0"
"Oct"	"2018"	"0"
"Nov"	"2018"	"0"
"Dec"	"2018"	"0"
"Jan"	"2018"	"175"
"Feb"	"2018"	"135"
"Mar"	"2018"	"335"
"Apr"	"2018"	"170"
"May"	"2018"	"180"
"Jun"	"2018"	"30"

As always pointers and suggestions welcomed.

create an sub query that makes “dummy” records, with zero values
I’ve done this before, but the exact syntax will depend on mySQL

Thanks Dave but I am not sure how to do that :-(, any dialect of SQL would help, translation to mySQl won’t be a problem.

Stack Oveflow suggested the below which quite frankly went over my head!

t1.month,
t1.md,
coalesce(SUM(t1.amount+t2.amount), 0) AS total
from
(
  select DATE_FORMAT(a.Date,"%b") as month,
  DATE_FORMAT(a.Date, "%m-%Y") as md,
  '0' as  amount
  from (
    select curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY as Date
    from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
    cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
    cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
  ) a
  where a.Date <= NOW() and a.Date >= Date_add(Now(),interval - 12 month)
  group by md
)t1
left join
(
  SELECT DATE_FORMAT(date, "%b") AS month, SUM(total_price) as amount ,DATE_FORMAT(date, "%m-%Y") as md
  FROM cart
  where Date <= NOW() and Date >= Date_add(Now(),interval - 12 month)
  GROUP BY md
)t2
on t2.md = t1.md 
group by t1.md
order by t1.md
;

I’m sure there are more efficient ways to do it, but I worked the problem in Postgres and this is what I came up with. (‘tmp’ is a schema we use for temporary tables that should be dropped when a process is done). Maybe this will give you some ideas.

begin;

--
-- Create some sample data
--
create table tmp.tester (
  id bigserial primary key,
  entered date,
  total real
);

insert into tmp.tester (
  entered,
  total
) values (
  '2018-01-01',
  1.0
), (
  '2018-01-15',
  1.0
), (
  '2018-04-01',
  1.0
);

--
-- Actual process starts here
--
create table tmp.months (
  m integer,
  y integer
);

--
-- Fill in months
--
do $$
declare
  date_start date;
  months integer;
  first_date date;
  index integer;
  this_date date;
begin
  date_start := '2018-01-15'::date; -- Would be supplied by a parameter somewhere
  months := 12; -- Also a parameter
  
  first_date := make_date (extract(year from date_start )::int, extract (month from date_start)::int, 1);
  for index in 0..months - 1 loop
    this_date := first_date + (index || ' months')::interval;
    insert into tmp.months (m, y) values (extract (month from this_date), extract (year from this_date));
  end loop;
end $$;

select
  mth.m as month,
  mth.y as year,
  coalesce (sum(total), 0.0)
from
  tmp.tester t
  right join tmp.months mth on (mth.m = extract (month from t.entered) and mth.y = extract (year from t.entered))
group by
  mth.m,
  mth.y
order by
  mth.y,
  mth.m
;

drop table tmp.months;

--
-- Actual process ends
--

--
-- Drop the sample data
--
drop table tmp.tester;

rollback;

The key is the “right join” that tells the query to create a row even if there is no data on the left side of the join.

Another idea: don’t create the months table when needs. Rather, pre-populate it with every month/year starting from your earliest “entered” and going through, say, the year 2100. In about 80 years, you can populate it again. :slight_smile:

Or let Xojo do the lifting and have it fill in a blank month when it encounters non-consecutive months in your recordset.

Yes, sql devs, I hear you screaming.

If you are only using this query from one environment (in this case, Xojo), that’s a fine alternative. If you need it to work across environments though…

This might be an efficient way to do this.

SELECT
SUM(IF(month = ‘Jan’, total, 0)) AS ‘Jan’,
SUM(IF(month = ‘Feb’, total, 0)) AS ‘Feb’,
SUM(IF(month = ‘Mar’, total, 0)) AS ‘Mar’,
SUM(IF(month = ‘Apr’, total, 0)) AS ‘Apr’,
SUM(IF(month = ‘May’, total, 0)) AS ‘May’,
SUM(IF(month = ‘Jun’, total, 0)) AS ‘Jun’,
SUM(IF(month = ‘Jul’, total, 0)) AS ‘Jul’,
SUM(IF(month = ‘Aug’, total, 0)) AS ‘Aug’,
SUM(IF(month = ‘Sep’, total, 0)) AS ‘Sep’,
SUM(IF(month = ‘Oct’, total, 0)) AS ‘Oct’,
SUM(IF(month = ‘Nov’, total, 0)) AS ‘Nov’,
SUM(IF(month = ‘Dec’, total, 0)) AS ‘Dec’
FROM (
SELECT DATE_FORMAT(rep_copy.Entered, “%b”) AS month, SUM(Time_Minutes) as total
FROM rep_copy
WHERE (rep_copy.Entered >= Date_add(Now(), interval - 11 month)) AND (rep_copy.Entered <= NOW()) AND
(rep_copy.Own_Time = ‘1’) AND (rep_copy.Code = ‘52’)
GROUP BY DATE_FORMAT(rep_copy.Entered, “%m-%Y”)) as sub

What if the range isn’t a calendar year?

If the date range doesn’t span more than one year then I think it will work (although the months won’t be in the correct sort order). If the range is over one year then it definitely won’t work.

Thanks for the suggestions, food for thought which I will try after the weekend.

Read this:
https://use-the-index-luke.com/blog/2011-07-30/mysql-row-generator

It should help you get some ideas.

In Postgresql I would do:

select myrange.monthname, myrange.year, case when total isnull then 0 else total end as total from (select extract(year from generate_series((date(Now()) - interval '21 months')::timestamp, now()::timestamp, '1 month'::interval))::integer as year, extract(month from generate_series((date(Now()) - interval '21 months')::timestamp, now()::timestamp, '1 month'::interval))::integer as imonth, to_char(generate_series((date(Now()) - interval '21 months')::timestamp, now()::timestamp, '1 month'::interval), 'Mon') as Monthname) as myrange left join ( SELECT to_char(entered, 'Mon') AS MONTH, extract(month from entered)::INTEGER as imonth, extract(year from entered)::INTEGER AS year, SUM(time_minutes) AS total FROM rep_copy WHERE entered <= NOW() AND own_time = '1' AND entered >= date(Now()) - interval '21 months' AND code = '52' GROUP BY extract(year from entered)::INTEGER, to_char(entered, 'Mon'), extract(month from entered)::INTEGER order by year, imonth) as rs on rs.year=myrange.year and rs.imonth=myrange.imonth;

Where myrange is a generated range of years/months and I then left join it with the data that is found in the recordset.
I used an interval of 21 months to test if it worked across years. The test seemed to work on my end.
Hopefully this combined with the article will help you along the way.

Thanks for all the suggestions, I learnt a lot along the way (and found some useful web resources thanks to you all) but decided to ‘cheat’ and use XOJO and mySQL to do it quick and dirty. As I am only dealing with a couple of thousand records at a time the speed is exceptional with XOJO so its not an issue even if its not as ‘neat’ as I started out.

Some of the code relates to a different bit of the method so ignore the unused variables. I make use of the Extract function in MySql, and the date function to get where to start and build from there. As far as I can tell, having tried various dates it seems to work for all.

With the resulting arrany I populate an Excel spreadsheet and graph.

[code]
dim typesarray(6,1) as Int64
dim c as integer
dim localsql as string
dim sql12 as RecordSet
dim months as string=“Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec”

//calculate last 12 months dates
dim d as new Date
dim mnths() as string
dim l as integer //loop 12 months
d.year=d.year-1 //last year same month as now
d.month=d.month
d.day=1

dim answers(12,3) as string
for l=0 to 11
d.month=d.month+1
//get work time
localsql=“select sum(Time_Minutes) as total FROM rtime WHERE extract(MONTH FROM Entered) = “+cstr(d.month)+” AND extract(YEAR FROM Entered) = “+cstr(d.year)+” and Own_Time=‘1’ and Code=‘52’”
sql12=db.SQLSelect(localsql)
dberror
answers(l,0)= nthfield(months,",",d.month)
answers(l,1)=cstr(d.year)
if val(sql12.field(“total”).StringValue)>0 then
answers(l,2)=sql12.field(“total”).StringValue
else
answers(l,2)=“0”
end if

//get work time
localsql=“select sum(Time_Minutes) as total FROM reptime WHERE extract(MONTH FROM Entered) = “+cstr(d.month)+” AND extract(YEAR FROM Entered) = “+cstr(d.year)+” and Own_Time=‘0’ and Code=‘52’”
sql12=db.SQLSelect(localsql)
dberror
if val(sql12.field(“total”).StringValue)>0 then
answers(l,3)=sql12.field(“total”).StringValue
else
answers(l,3)=“0”
end if
next[/code]