I have two databases that are used by two different apps. One is the master the other needs to get updates from the master. Both tables are identical.
Currently I am looping through the results which takes about 10-12 seconds to complete. Can anyone suggest some SQL code to do this on the database itself without looping? Or a better/faster method of doing the comparison?
I am using PostgreSQL, Xojo 2022 R3.2 targets are Windows and Raspberry Pi. However the databases all reside on the same computer.
This may not be what you are looking for, but this is certainly how I prefer to compare tables between databases:
Many database tools have the ability to compare tables between databases, and sometimes, have data transfer and data synchronizing tools also. With PostgreSQL, I use Navicat on a Windows computer. It does have the feature. A number of other tools that I tried and since abandoned, have the feature also. The host system does not really matter, so long as you configured postgresql to allow remote access and opened the firewall on the hosts. In the spirit of full disclosure, Navicat is not free and certainly not cheap. But for me, it is worth the money. I don’t upgrade with every new version. As I said, it is not cheap.
theSQL = "select from table2 left join table1 where table2.field1 <> table1.field1 or table2.field2 <> table1.field2 or table2.field3 <> table1.field3"
Should identify the changed records.
Then you can then loop through them and process as required.
Var oInsert as PostgreSQLPreparedStatement
Var oUpdate as PostgreSQLPreparedStatement
Var oQuery as PostgreSQLPreparedStatement
oInsert = SQLitePreparedStatement(db.Prepare("Insert into table1 values (?,?,?)"))
oUpdate = SQLitePreparedStatement(db.Prepare("update table1 set field1=?, field2=? where field3=?"))
// Search for records
While not table2->eof // loop though records
If UpdateRequred then // If we need an update
oUpdate.SQLExecute( value1, value2, value3 )
else if InsertRequired then // If we need to insert
oInsert.SQLExecute( value1, value2, value3 )
// Move to next record
Do you need to make the slave table identical to the master?
if so, empty the slave and copy all records from the master
Do you need to make only records they have in common change in slave to match master?
if so, create a temporary table at the slave end, copy the records which exist in both tables, then delete them from the slave table and insert from the temp table.
If you truly need to know which records exist in common but are different, you probably need to assemble a query like this one:
select a.*, count(1) as diffrecordcount
from master a, slave b where a.ID = b.ID and
coalesce(a.fieldname1,'X') <> coalesce(b.fieldname1,'X')
coalesce(a.fieldname2,'X') <> coalesce(b.fieldname2,'X')
I used coalesce here … some dbs use ISNULL or similar.
The comparison must handle null values and provide a default fallback for the field, because comparing NULL with a value doesn’t work. I tend to use a value that wouldnt occur for real. Like ZZZbananaZZ for a text field…
If the two should always be in sync, I recommend setting up replication and let PG handle the details. However, I’m assuming your needs are a little more complex.
Option A, add a generated column that prepares a hash of the important columns. A stored generated column will update this hash whenever the record is written. You can compare this hash between the two tables to determine if a row does not match. Can you join between the two tables? It’s not clear to me how you’re currently comparing. This option doesn’t really change much, just makes it more convenient to compare the rows rather than doing lots of column comparisons.
Option B, just insert everything and use an ON CONFLICT clause with the primary key. Something like INSERT INTO table (column) VALUES (value) ON CONFLICT (primary_key) DO UPDATE SET column = EXCLUDED.column WHERE column IS DISTINCT FROM EXCLUDED.column; should work, though I might have made a mistake since I’m typing this off the top of my head. I don’t really like this option though, because it requires a failure first then an update, which is slower than just avoiding the insert in the first place.