This is one concept that has eluded me for years…
Given two computers… each with database “XYZ” and each instance contains a table “ABC” with the exact same structure… easy enough
Data is entered, modified and/or deleted on each of the two computers , so now the contents of each database instance are no longer the same.
How does one determine what the correct merged result would be.
Computer A modifies record PK3, Computer B deletes it. What does a synched database have for PK3?
the same if A modifies a record, and B modifies the same record, but each modifies a differnt field
etc for all the other out of synch scenarios
I’ve seen iPad apps that have a desktop version, and they can synch … Addresses and Contacts and Calendars as example.
If you’ve ever tried sharing an Excel Workbook you’ll have had the pleasure of the mysteries of this!
(Actually that’s one of the reason’s I’m building our Office Management in Xojo and a MySQL database - Excel shared workbooks are a nightmare)
I may be wrong in some of the details here, but if you think of each Excel "Sheet’ as a DB table, when you Save a Shared (say on Dropbox) Workbook - if another user has made modifications it will ask if you want to update your Workbook with their changes. I believe these are stored in a “delta” file - ie. a file whch contains a list of changes each user has made to the Workbook. Deletions as opposed to record changes get’s complicated though which is where I’ve seen most people (self included) run into trouble with shared Excel workbooks.
I believe that’s why anything that needs to be robust uses SQL DBs as there can only ever be one operation performed on the data at a time - ‘synching’ two disparate locally stored DBs always has the potential for lines being deleted on one and therefore synching becomes fundamentally impossible to a point.
First step, ditch your integer primary keys. Use a UUID instead, they were designed to solve this problem.
Second, store a modification datetime with gmtoffset. Newest modification wins.
Third, store a table of delete actions with times. Drop records whose time is less than the modification time. Then use the remaining records to drop the linked records, and drop the delete record as you process them.
Take this with a grain of salt though. It’s been years since I’ve written a sync system, and I think at least one Xojo engineer can attest to the fact that I suck at it. But that’s the general concept.
I’ve never seen one, not that there isn’t one out there somewhere. We’ve done a DB sync in several projects. It’s always a royal pain.
One thing that we’ve learned is to never depend on the primary keys between two different databases. We use the UUID for whatever database creates the original record. Then, everyone is on the same page as figuring out which record are we really talking about. For example, if the server creates 10 records with ID’s 1-10 but only 3 go to Computer A and 7 go to Computer B then you can’t rely on the record id’s from either Computer A or B so the UUID created on the server is the only reliable way to determine unique record id’s.
After that, it’s a matter of figuring out your rules for conflicts. If Record A is modified since last sync on both computers A, B, and the server how do you sort it out? Attempt the merge but then what happens if they’ve all modified the same field? Does the server become the arbiter or do the client attempt to sort it out themselves?
Don’t delete anything - ever. Mark them as deleted. Once you delete the data you are in for a world of hurt. Did I mention don’t delete data?
We ended up writing out a fairly lengthy and involved pseudo flowchart to write code to. I suggest you do it too.
I haven’t done this yet, but it might be helpful to go look at Apple documentation for CoreData and iCloud documentation and see if you can glean any insights from it. Most iOS apps are probably using CoreData and I believe it has a Sync method built in. You’ll have to recreate it in Xojo.
Consider using the Dropbox Datastore API which is built to handle data syncing. This video from DBX 2013 provides a good overview. Even if you choose to not use their solution, it may give you some very good ideas for creating your own.
Our product (see videos at www.lightbluesoftware.com) does database syncing. It’s REALLY hard and took us months of work on paper before we even started coding.
Thom and Bob have some good tips: UUIDs, deleting and so on. But once you start getting beyond a very simple database, it gets massively more complicated. Some things to think about:
If you use timestamps to determine which is newer, beware of people working in different timezones.
If you do decide to delete records (see Bob’s comments!) then if you have the concept of parent and child records (e.g. a Contact record who’s linked to several Address records via a ContactAddressLink table) you’ll need to work out some mechanism for deleting the links (or marking them as deleted) - and, possibly, deleting the addresses as well if they’re not used by anything else.
There are always more edge cases than you think. What if you’re running on three computers and two try to sync changes to the same record at precisely the same time? Etc. etc.
It’s quite fun once you get it working but there’s some pretty intense thinking to do before you get to that stage. As Bob says, good luck!
Not sure if you were talking to me or not, John, but I’ll reply as if you were: we write software to run a photography business. Yes, the sort of thing which we’ve written is a bit like version control, but we wanted very fine-grained control on the syncing (we do it at database field level, not row level) and also to sort out all conflicts automatically (i.e. never to say ‘do you want theirs or yours’ because that way confusion lies). Many of our users are very non-techy and so we like things to be as simple as possible…
Yes it was you I addressed my question too. Cool… Yes always problems when you deal with Non-techy users… On the other hand. This is the 21st century, we should be able to make easy interfaces for users now.
Oh, we could reasonably easily make an easy interface, but it’s placing decisions on a day-to-day basis in the hands of people who aren’t sure about the answers, so we try to make it even easier than that.
Something we try to do now is ‘fix problems twice’. If someone gets in touch with a query where they can’t do something, or have a problem, we a) tell them how to fix it, and b) try to change the software so their problem can’t happen again. It massively reduces our support workload, over time. It’s a fascinating process to go through
If they are on the same network you could have the desktop app be the ‘server’ and have a TCP Socket connection between the two.
If they are not, but both will have internet connection, you might want to have a web app bridge. Both communicate to the centralized web app on a web server. Then each acts like a client to the web app. That way your programming is pretty much the same between desktop and mobile. We’ve done this with a web app using HandleSpecialURL and you’ll have to define your API so that you can sync properly.
We’ve done this with SQLite. The database really doesn’t matter much.
Do yourself a favor, include the gmt offset. Always. Even in your case, if the user were to travel back and forth between the US coasts, your timestamps can still be resolved. Without it, your timestamps created in the other time zone won’t calculate correctly.
Either include the specific offset, or do everything in GMT. You will save yourself headaches later.