SQL question

nicolás_canessa · August 2, 2019, 3:37pm

i have this table:

Table name? sales_per_month
year month category quantity
2018 12shirts 25
2018 12shoes 98
2018 12 umbrellas 2
2018 11 shirts 108
2018 11 shoes 12
2018 11 umbrellas 33
2018 10 shirts 59
2018 10 umbrellas 9

nicolás_canessa · August 2, 2019, 3:38pm

i have this table:

Table name? sales_per_month year month category quantity 2018 12 shirts 25 2018 12 shoes 98 2018 12 umbrellas 2 2018 11 shirts 108 2018 11 shoes 12 2018 11 umbrellas 33 2018 10 shirts 59 2018 10 umbrellas 9

and there is another table:

Table name? categories hats shirts shoes umbrellas

what i need is to show ALL categories regardless of the quantity. if there have been no sales in a certain category, I also want to show a record where the quantity is zero

The result I want to get is like the following:

expected result year month category quantity 2018 12 hats 0 * <- record i want to add 2018 12 shirts 25 2018 12 shoes 98 2018 12 umbrellas 2 2018 11 hats 0 * <- record i want to add 2018 11 shirts 108 2018 11 shoes 12 2018 11 umbrellas 33 2018 10 hats 0 * <- record i want to add 2018 10 shirts 59 2018 10 shoes 0 * <- record i want to add 2018 10 umbrellas 9

DaveS · August 2, 2019, 4:13pm

SELECT a.category, b.year,b.month,ifnull(b.quantity,0) as quantity
from categories
left join sales on a.category=b.categoy

nicolás_canessa · August 2, 2019, 8:01pm

thanks dave for the quick reply. i’ve tried that query but it doesn’t produce the desired result. it only adds the missing categories once, but what i need is to add the missing categories for each group (year, month)

Scott_Griffitts · August 2, 2019, 9:44pm

I think this matches your expected result:

select b.year, b.month, a.category, coalesce(c.quantity, 0) as quantity from categories a cross join sales_per_month b left join sales_per_month c on c.year = b.year and c.month = b.month and c.category = a.category group by b.year, b.month, a.category order by b.year desc, b.month desc, a.category

Maximilian_Tyrtania · August 3, 2019, 6:35am

The conventional approach to generate these report-like views is to generate the dates so that you can join them with your sales data. Are you using SQLite? then see: https://www.geekytidbits.com/generate-date-range-sqlite/ . In Postgres you’d use generate_series (as mentioned in the article, select i::date from generate_series(‘2019-01-29’,
‘2019-07-03’, ‘1 day’::interval) i for example does the trick). Or just create a table that holds all of the dates (or year and month numbers if that’s all you have) that you’re interested in and then left join all of the data.

Jeff_Tullin · August 3, 2019, 6:46am

You could also simply add rows to sales per month for the missing records, with a value of 0
It doesn’t scale to a database of several thousand product lines, but if the numbers are small…

Simply, at the start of each month you add one row for each product with a value of 0
Then proceed as normal, and use your current simple query.

nicolás_canessa · August 5, 2019, 3:31pm

thanks all for your inputs. i like jeff Tullin approach, but i don’t even know where to start… how could I possibly do that?

Scott Griffitts… i will try this approach, thanks

Jeff_Tullin · August 5, 2019, 5:26pm

When the app opens, check the year and month
If there are no rows in the database with the current row and month,
insert a row of 0 value for every item in the stock list.

eg if it is 2019 and month 11:

If the returned value is zero then:

Stéphane_Mons · August 5, 2019, 5:53pm

@Nicolas Canessa I think there is something fundamentally wrong in your database design, i.e. you should not have tables summarizing the contents of other tables (unless really necessary) because it is at best redundant, at worst false. Let me explain: the goal of SQL is to extract data from a large number of rows so you always get the right result. Whenever you store a result into a new table and the original data change, then your stored result is outdated and, then, false.

I think it would be better to always consider a table containing ALL of the sales (including the date) to extract the number of sales per month etc. And indexes will help you making such a search fast depending on your needs.

Also remember that you do not need to do everything in an SQL statement. You can also execute an SQL statement, process the result in Xojo and then issue a series of other SQL statement to achieve your goal.

For example, you don’t need to have a table to keep track of your categories. You can get the list through an SQL statement and then, in Xojo, create one SQL statement for each category.

DaveS · August 5, 2019, 6:20pm

I would disagree, however the Category Reference table should be linked via a CategoryID, not the Category Name

this way to add or change a category description, requires updating ONE record in ONE table to affect the entire database

Stéphane_Mons · August 5, 2019, 6:32pm

@Dave S Yes you have a point! But I didn’t want to scrap the whole database

nicolás_canessa · August 9, 2019, 1:10pm

I have a table with many records… the sales_per_month table is the result of a query to get the subtotals.
the categories table has an ID column, I just haven’t put it here for simplicity’s sake
with the Scott Griffitts query I get all the records from the categories table, but only once, but what I need is all categories to be repeated for each year/month group.

i need all categories to always appear in each group, as my ultimate goal is to add six more columns to this table from other tables that are exactly the same but with different totals. in short, i want to normalize the number of rows. i want the seven tables to have exactly the same number of rows so that i can add the other columns of totals.

final expected result
year month category quantity1 quantity2 quantity3 quantity4 quantity5 quantity6
2018 12 hats 0 56 564 2 5 15
2018 12 shirts 25 33 45 2 54 25
2018 12 shoes 98 890 0 54 3 111
2018 12 umbrellas 2 478 0 357 12 321
2018 11 hats 0 91 564 32 54 555
2018 11 shirts 108 77 546 45 0 297
2018 11 shoes 12 0 654 999 0 7
2018 11 umbrellas 33 0 56 33 0 7
2018 10 hats 0 44 321 4 0 8
2018 10 shirts 59 0 789 785 0 11
2018 10 shoes 0 99 12 214 54 0
2018 10 umbrellas 9 65 41 0 39 99

https://ibb.co/f1dbTs9 --> formatted table

Craig_Boyd · August 13, 2019, 3:49am

Hi Nicols Canessa,

You can go that route, but it isn’t very flexible and you are creating a lot of work for yourself in the long term as this grows.
What happens when users want to span multiple months or cross the monthly ends / beginnings? Admittedly, the last is not common, but multiple months of reporting is very common as well as year over year analysis.
Also, it is not uncommon to have orders change in the previous month because of returns, cancellations, etc…
That sort of behavior can bleed into the sixty and even ninety day history.

IF you are reporting directly from the front end database AND the transaction frequency is not too much then you could:

Use a set of triggers and create a kind of summary table that stays in sync.
The summary table though would need to look more like:
https://ibb.co/z5xgR65
SQL is more about data than it is about layouts and formatting. I completely agree with @StphaneMons on this point.
Use Xojo to do the layout on the screen and make it pleasing to the end user’s eye.
The care you must take here is if the transaction frequency grows significantly then you could have to address trigger performance issues, but to be fair, if you aren’t don’t a lot of fancy work with the trigger and just inserting data like this it will likely take a long while for that to really become an issue.
Use a set of views to do the summation. This, like #1, will keep your totals always current since a view is just an unexecuted SQL statement waiting to be used. So every time the view is hit the SQL will execute. Depending on the DBMS you are using the SQL engine may choose to cache the results if the SQL statement is identical each time, but any change will cause the engine to choose to go and read directly from the tables. The downside to this approach is that as your transactions grow in quantity so will your performance.
Lastly, depending on your DBMS, you could use Materialized Views which is much like option #1, but can be more powerful and put off (not eliminate) some of the performance issues that you could run into. I will not go into much detail about that here since the implementation of MVs varies by platform and can quickly become a tad complex.

IF you don’t have to report from the front end database AND you can move the transactions over to a reporting database then you should check out Ralph Kimball’s work on the subject of Facts and Dimensions.
You would essentially end up with a structure like this:
https://ibb.co/3FJ2xWs
The above example is taken almost directly from “The Data Warehouse Toolkit”, Chapter 3.
Basically you use the Dimension tables (Store, Product, Date, and Payment Method) in your WHERE clause and the FACT is or should be, mostly columns that you will perform math functions on (sum, average, median, mean, etc…).

Hope that helps.
This approach affords the most flexibility, but it comes with the price of a lot of work to set it up.

nicolás_canessa · August 13, 2019, 3:21pm

I’ve solved it with this query. I thank you all for your help.

SELECT x.sale_year, x.sale_month, COALESCE(s1.quantity, 0) AS "Qty", x.cat_id FROM sale s1 RIGHT JOIN ( SELECT DISTINCT s.sale_year, s.sale_month, c.cat_id FROM sale s, category c ORDER BY s.sale_year, s.sale_month, c.cat_id ) x ON s1.sale_year = x.sale_year AND s1.sale_month = x.sale_month AND s1.sale_cat = x.cat_id ORDER BY sale_year, sale_month, cat_id;