UUID - separate table?

For my upcoming projects I want to get rid of the auto-increment primary keys and use UUIDs. I know that using a decent UUID generation algorithm will result in a very low collision probability. However, might it make sense to have an additional table and track all used UUIDs - and check against those when adding a new record?

If your UUID field is set to unique the database will error which you can trap and generate a new UUID.

I haven’t tried this, but that seems like an excellent idea as long as you “check” your new UUID by attempting to add it to the table, then checking for an error. As long as there is none, you know you’re safe to use it for your new record.

I’m pretty sure he wants an ID that is unique across all records in every table.

The more tables you have the slower it’s going to be to check that it truly is UNIQUE across all tables unless you keep one huge master table full of just UUID’s

Just curious why you want to do this ?

Accidentally clicked on solved… sorry! can’t undo!?

That was the plan.

To avoid collisions?

Collisions in what ?
IN a single DB ?
Synching many db’s together ?

UUIDS in a single DB are unnecessary if you design it properly

It’s also about syncing across devices, merging records etc.

From a discussion on the old forum :

That is an entirely different issue - I’ve done that before BUT the merging & synching has more to do with business rules than anything. The rest is fairly mechanical.

Still doesn’t require UUID’s and a single UUID table that you constantly try to add new ones to will slow things down when you get to any reasonably large size db

I did :stuck_out_tongue:

Maybe this should be a separate conversation, but I wonder about generating UUID’s. I know there is an MBS plugin for it (isn’t there always?), and there are specific algorithms, but it seems to me that I’d just take some random bytes, add Microseconds as string, then generate a SHA512 hash on it which I’d then hex-encode. With 1.3e154 possible hashes, I’d be pretty comfortable that there would be no collisions, at least not in my lifetime. Something like this:

Function UUID() As String
  static myRandom as new Random
  dim bytes as new MemoryBlock( 64 )
  for i as integer = 0 to 63
    bytes.Byte( i ) = myRandom.InRange( 0, 255 )
  next i
  dim r as string = bytes + str( Microseconds )
  r = Crypto.SHA512( r )
  r = EncodeHex( r )
  return r
End Function

If you want something shorter, you can use SHA256 (1.16e77 combinations) or SHA1 (1.46e48 combinations). Issues?

If you’re concerned about syncing data, wouldn’t you have to worry about syncing the UUIDs, too? I mean, you could conceivably have a collision between databases. So a “master” UUID table in the database still doesn’t cut it. You’d have to implement some common mechanism that all databases query to get a UUID.

Right - thats why I say that UUID’s aren’t really required to sync DB’s.
It has more to do with designing the DB with syncing in mind so you can match up records that are “the same” and then deciding what to do if records in both db’s (or more than one) have been altered.

That requires good selection of primary keys so you CAN identify things properly and definitively match rows.

For instance if you’re synching customer data then a customer ID makes sense.
Now, can each end user of this database add new customers ?
If so then how you sync will have to resolve the possibility that I added a customer with ID 10 and someone else also may have added a different customer with that ID.
But HOW this should work really is a business driven not technology driven.

Also an Einhugur function…

I haven’t understood why yet, but the Adobe Lightroom catalog (a SQLite database) uses a lot of UUIDs. Since Lightroom is single user and doesn’t sync with other catalogs, I don’t know why they use them…

most apps that use them use it guarantee unique IDs. and there is no limit on IDs (integers have a limit based on the size of the int).

UUID help a lot for db sync, but you have also to check for other constrains (logic, time etc, Depends on what are the rules of you data)
I use standard OS functions to generate them with just a “postproduction” to let them be consistent between platforms.
The probability of a collision is very little (generated billions and never got any problem) and, more important, you should care of it at table level.

Do you care if, using integer id, you have a order with id =1 and a customer with id=1 ?

The example I used was the Adobe Lightroom Catalog. I access it with my Xojo app to retrieve data about the image. I had to hack/reverse engineer the LR Schema. It uses UUID in several places and I have no Idea why. LR catalogs don’t sync with other catalogs and with the exception of the Mobile LR app on an iPad, doesn’t interface with ANYTHING else, at least that’s documented. The UUIDs may relate to the mobile app but I don’t know that. I never looked at the LR Schema prior to there being the mobile app.