SQL question

  1. 2 weeks ago
  2. nicolás c

    Aug 2 Pre-Release Testers, Xojo Pro argentina
    Edited 2 weeks ago

    i have this table:

    Table name→ sales_per_month			
    year	month	category    quantity
    2018	12	        shirts	      25
    2018	12	        shoes	      98
    2018	12	        umbrellas    2
    2018	11	        shirts	      108
    2018	11	        shoes	      12
    2018	11	        umbrellas    33
    2018	10	        shirts	      59
    2018	10	        umbrellas     9

    and there is another table:

    Table name→ categories
    hats
    shirts
    shoes
    umbrellas

    what i need is to show ALL categories regardless of the quantity. if there have been no sales in a certain category, I also want to show a record where the quantity is zero

    The result I want to get is like the following:

    expected result				
    year	month	category    quantity	
    2018	12	        hats	      0	        *  <- record i want to add
    2018	12	        shirts	      25	
    2018	12	        shoes	      98	
    2018	12	        umbrellas    2	
    2018	11	        hats	      0	        * <- record i want to add
    2018	11	        shirts	      108	
    2018	11	        shoes	      12	
    2018	11	        umbrellas    33	
    2018	10	        hats	      0	        * <- record i want to add
    2018	10	        shirts	      59	
    2018	10	        shoes	      0	        * <- record i want to add
    2018	10	        umbrellas    9	
  3. Dave S

    Aug 2 San Diego, California USA
    SELECT a.category, b.year,b.month,ifnull(b.quantity,0) as quantity
    from categories
    left join sales on a.category=b.categoy
  4. nicolás c

    Aug 2 Pre-Release Testers, Xojo Pro argentina

    thanks dave for the quick reply. i've tried that query but it doesn't produce the desired result. it only adds the missing categories once, but what i need is to add the missing categories for each group (year, month)

  5. Scott G

    Aug 2 Pre-Release Testers, Xojo Pro

    I think this matches your expected result:

    select b.year, b.month, a.category, coalesce(c.quantity, 0) as quantity
    from categories a
    cross join sales_per_month b
    left join sales_per_month c on c.year = b.year and c.month = b.month and c.category = a.category
    group by b.year, b.month, a.category
    order by b.year desc, b.month desc, a.category
  6. Maximilian T

    Aug 2 Pre-Release Testers, Xojo Pro Europe, Germany, Berlin

    The conventional approach to generate these report-like views is to generate the dates so that you can join them with your sales data. Are you using SQLite? then see: https://www.geekytidbits.com/generate-date-range-sqlite/ . In Postgres you'd use generate_series (as mentioned in the article, select i::date from generate_series('2019-01-29',
    '2019-07-03', '1 day'::interval) i for example does the trick). Or just create a table that holds all of the dates (or year and month numbers if that's all you have) that you're interested in and then left join all of the data.

  7. Jeff T

    Aug 2 Pre-Release Testers Midlands of England, Europe

    You could also simply add rows to sales per month for the missing records, with a value of 0
    It doesn't scale to a database of several thousand product lines, but if the numbers are small...

    Simply, at the start of each month you add one row for each product with a value of 0
    Then proceed as normal, and use your current simple query.

  8. nicolás c

    Aug 5 Pre-Release Testers, Xojo Pro argentina

    thanks all for your inputs. i like jeff Tullin approach, but i don't even know where to start... how could I possibly do that?

    Scott Griffitts.. i will try this approach, thanks

  9. Jeff T

    Aug 5 Pre-Release Testers Midlands of England, Europe

    but i don't even know where to start... how could I possibly do that?

    When the app opens, check the year and month
    If there are no rows in the database with the current row and month,
    insert a row of 0 value for every item in the stock list.

    eg if it is 2019 and month 11:

    select count(1) from sales_per_month where year = 2019 and month = 11

    If the returned value is zero then:

    insert into sales_per_month (year,month,category, quantity) select 2019,11,category, 0 from categories

  10. @Nicolas Canessa — I think there is something fundamentally wrong in your database design, i.e. you should not have tables summarizing the contents of other tables (unless really necessary) because it is at best redundant, at worst false. Let me explain: the goal of SQL is to extract data from a large number of rows so you always get the right result. Whenever you store a result into a new table and the original data change, then your stored result is outdated and, then, false.

    I think it would be better to always consider a table containing ALL of the sales (including the date) to extract the number of sales per month etc. And indexes will help you making such a search fast depending on your needs.

    Also remember that you do not need to do everything in an SQL statement. You can also execute an SQL statement, process the result in Xojo and then issue a series of other SQL statement to achieve your goal.

    For example, you don't need to have a table to keep track of your categories. You can get the list through an SQL statement and then, in Xojo, create one SQL statement for each category.

  11. Dave S

    Aug 5 San Diego, California USA
    Edited 2 weeks ago

    @Stéphane ;Mons For example, you don't need to have a table to keep track of your categories. You can get the list through an SQL statement and then, in Xojo, create one SQL statement for each category.

    I would disagree, however the Category Reference table should be linked via a CategoryID, not the Category Name

    this way to add or change a category description, requires updating ONE record in ONE table to affect the entire database

  12. @Dave Sisemore — Yes you have a point! But I didn't want to scrap the whole database :-)

  13. last week

    nicolás c

    Aug 9 Pre-Release Testers, Xojo Pro argentina
    Edited last week

    I have a table with many records... the sales_per_month table is the result of a query to get the subtotals.
    the categories table has an ID column, I just haven't put it here for simplicity's sake
    with the Scott Griffitts query I get all the records from the categories table, but only once, but what I need is all categories to be repeated for each year/month group.

    i need all categories to always appear in each group, as my ultimate goal is to add six more columns to this table from other tables that are exactly the same but with different totals. in short, i want to normalize the number of rows. i want the seven tables to have exactly the same number of rows so that i can add the other columns of totals.

    final expected result
    year month category quantity1 quantity2 quantity3 quantity4 quantity5 quantity6
    2018 12 hats 0 56 564 2 5 15
    2018 12 shirts 25 33 45 2 54 25
    2018 12 shoes 98 890 0 54 3 111
    2018 12 umbrellas 2 478 0 357 12 321
    2018 11 hats 0 91 564 32 54 555
    2018 11 shirts 108 77 546 45 0 297
    2018 11 shoes 12 0 654 999 0 7
    2018 11 umbrellas 33 0 56 33 0 7
    2018 10 hats 0 44 321 4 0 8
    2018 10 shirts 59 0 789 785 0 11
    2018 10 shoes 0 99 12 214 54 0
    2018 10 umbrellas 9 65 41 0 39 99

    https://ibb.co/f1dbTs9 --> formatted table

  14. 7 days ago

    Craig B

    Aug 12 Pre-Release Testers, Xojo Pro Dallas, TX
    Edited 7 days ago

    Hi Nicolás Canessa,

    You can go that route, but it isn't very flexible and you are creating a lot of work for yourself in the long term as this grows.
    What happens when users want to span multiple months or cross the monthly ends / beginnings? Admittedly, the last is not common, but multiple months of reporting is very common as well as year over year analysis.
    Also, it is not uncommon to have orders change in the previous month because of returns, cancellations, etc...
    That sort of behavior can bleed into the sixty and even ninety day history.

    IF you are reporting directly from the front end database AND the transaction frequency is not too much then you could:
    1. Use a set of triggers and create a kind of summary table that stays in sync.
    The summary table though would need to look more like:
    https://ibb.co/z5xgR65
    SQL is more about data than it is about layouts and formatting. I completely agree with @StéphaneMons on this point.
    Use Xojo to do the layout on the screen and make it pleasing to the end user's eye.
    The care you must take here is if the transaction frequency grows significantly then you could have to address trigger performance issues, but to be fair, if you aren't don't a lot of fancy work with the trigger and just inserting data like this it will likely take a long while for that to really become an issue.

    2. Use a set of views to do the summation. This, like #1, will keep your totals always current since a view is just an unexecuted SQL statement waiting to be used. So every time the view is hit the SQL will execute. Depending on the DBMS you are using the SQL engine may choose to cache the results if the SQL statement is identical each time, but any change will cause the engine to choose to go and read directly from the tables. The downside to this approach is that as your transactions grow in quantity so will your performance.

    3. Lastly, depending on your DBMS, you could use Materialized Views which is much like option #1, but can be more powerful and put off (not eliminate) some of the performance issues that you could run into. I will not go into much detail about that here since the implementation of MVs varies by platform and can quickly become a tad complex.

    IF you don't have to report from the front end database AND you can move the transactions over to a reporting database then you should check out Ralph Kimball's work on the subject of Facts and Dimensions.
    You would essentially end up with a structure like this:
    https://ibb.co/3FJ2xWs
    The above example is taken almost directly from "The Data Warehouse Toolkit", Chapter 3.
    Basically you use the Dimension tables (Store, Product, Date, and Payment Method) in your WHERE clause and the FACT is or should be, mostly columns that you will perform math functions on (sum, average, median, mean, etc...).

    Hope that helps.
    This approach affords the most flexibility, but it comes with the price of a lot of work to set it up.

  15. 6 days ago

    nicolás c

    Aug 13 Pre-Release Testers, Xojo Pro argentina

    I've solved it with this query. I thank you all for your help.

    SELECT x.sale_year, x.sale_month, COALESCE(s1.quantity, 0) AS "Qty",  x.cat_id
    FROM sale s1
    RIGHT JOIN
    (
      SELECT DISTINCT s.sale_year, s.sale_month, c.cat_id
      FROM sale s, category c
      ORDER BY s.sale_year, s.sale_month, c.cat_id
    ) x
    ON  s1.sale_year = x.sale_year
    AND s1.sale_month =  x.sale_month
    AND s1.sale_cat = x.cat_id
    ORDER BY sale_year, sale_month, cat_id;

or Sign Up to reply!