Can anyone figure out this SQL Query?

Assume the following table:

rowid itemid stuff     dedupe_itemid
1     111    aaaaaa    True
2     111    bbbbbb    True
3     111    cccccc    True
4     222    dddddd    False
5     222    eeeeee    False
6     333    ffffff    False
7     111    gggggg    True

We need a query that returns:

1     111    aaaaaa
4     222    dddddd
5     222    eeeeee   
6     333    ffffff

If the dedupe_itemid field is true, then we only want the row containing the first unique itemid. If the dedupe_itemid field is false, then we always want the row returned.

select 
  *
from
  stufftable as s
where
  dedupe_itemid = false 
  or not exists (
    select * 
    from stufftable as s1 
    where 
      s1.itemid = s.itemid 
      and s1.rowid < s.rowid 
      and s1.dedupe_itemid = s.dedupe_itemid
    limit 1 -- Edit to add this optimization
  )
;
1 Like

Note: if, as in your example, all 111’s will always be True, and all 222’s will always be False, this is not a great design. There should be a separate table that links the itemid to dedupe value.

(Assuming you have control over this and this represents the actual db schema.)

Thanks Kem!

Itemid 111 may not always be true. There is actually another column that is an item type that links to an item type table where we store the dedupe value. For example, itemid 111 of type A allows duplicates but item 111 of type B does not allow duplicates.

1 Like

Great.

If you got your answer, please remember to mark a “solution”.