DateTime Math?

Hello,

I’m building an app with Xojo using MySQL that periodically will reach out to an MSSQL Server database via XML and get a list of new or changed records between two DateTime markers. The MySQL database will then be updated. It’s a low traffic proposition. This is my first time doing something like this and am wondering how best to do the math when checking if it’s time for the next polling of the MSSQL Server database? I also don’t want to be dealing with daylight savings time issues. I’m thinking that I should use the TotalSeconds? Would this be correct or do I still need to set the GMTOffset to 0 or somehow deal with daylight savings time. Not sure if that would be correct.

Since it is a low traffic database I would probably only query the MS database every 30 minutes, 1800 seconds. It seems easy enough to work that out. OTOH, It is easy enough to get the SQLDateTime but I’m not clear on how to do the math on that, adjust for daylight savings time, and then translate the SQLDateTime into the format necessary for the MS database.

I’d be happy to hear some suggestons. Thanks.

Off the wall thinking for you, something to think about.

Rather than pull from mysql - why not push from mssql? If your mssql edition supports ssis (ETL) then simplybuild an ssis package that fires every 30 minutes and pushes data to your mysql database. No XML needed. You may not have to worry about the date stuff either?

Or

Look into replication (either from mysql or mssql) to get the data.

In the former case, since precision is not a factor, why not just use a Timer?

I can only access the mssql database via an XML API. So I have to prepare an XML statement where the datetime parameters looks like this:

"<StartDate>2013-04-01T00:00:00</StartDate>" "<EndDate>2013-04-01T23:59:00</EndDate>"

I get back a list of record id’s for records that have been created or have changed between those times. Then I loop through those and deal with it.

I want to roll the datetime parameters forward such that the of one query becomes the of the next. I don’t want to overlap and get duplicated data though it really wouldn’t be a problem. A timer wouldn’t work other than to fire off the process or check to see if it’s time to process and I’ll probably use a timer just for that purpose. I still have to get the datetime on the local server which does not come in the required format as above. I build the above format as a string.

It just seems to me to be more simple to use TotalSeconds than something like SQLDateTime. I’m thinking something like this. The timer fires, a comparison is made between the system TotalSeconds and the TotalSeconds of the last enddate used. If it’s greater than 1800 seconds then off we go.

Thanks for the feedback.

I think I see. I would take the latest returned DateTime of the last response, and use that for the start time of the next query. The end time would always be something far in the future. Then I’d deal with the inevitable overlap.

You’d always be using the remote system time so the local time wouldn’t be a factor.

Or am I still missing something?

Could be great if you had something like a field to keep this state like [NeedsProcessing] (0=false, 1=true, default=1). On INSERT it will be true, on UPDATE you set it to 1 too. So you could forget using the time of the insert or the time of the update to select touched records needing processing, you could just run something like SELECT * from MYDATA WHERE NeedsProcessing = 1

And as Kem told, you run your processing batch using a Timer firing at each 1800 seconds. Then you set NeedsProcessing = 0 for all those.

Yea, this is an odd project. The client runs these events and has been using this site to manage registration, attendance, email notifications, etc. --> http://www.cvent.com

Attendees of these events can enter and/or update their records directly through the Cvent website. So I can’t tell who is doing what and I can’t flag any records entered there. All I can do is get all changed records between two times.

I can keep track of what my client’s staff does with the app I’m building but not the general public through the website. That’s the gotcha as far as I can tell.

Using a timer to fire the process is what I’m going to do but I may do it at intervals less than what I really want to account for possible system crashes.

So do you see a downside to following my advice? It seems to me that if you just use the latest DateTime returned from the last query, it doesn’t matter how often you perform your query (minutes, hours, or days), you’ll always get all the records that have changed since the last time you checked. I recommended using an end time that’s in the future because you always want all the latest changes, right?

Has the database table not a unique Number ID that is incremented by every inserted row ?

If so, you could use that as reference.

He wants the modified records too.

The status field proposed by Rick is more robust but as you can’t modify the source database you can’t do it. You can probably end up with scenarios where you pick a timeframe on your client using your client machine’s clock and even when you have converted to the same time zone as the server, differences in the clocks, resolution of the time specified and the time to execute the query, could cause records to ‘escape’ your query. In the absence of being able to use a status field as per Rick’s suggestion you could add some overlap into the time period and check or reprocess the overlapped records at the start of the time period and as Kem suggests overlap the end time too. It is a bit of a kludge because you really want to be 100% sure and that is what the status field gives you.

[quote=32359:@John Hansen]Has the database table not a unique Number ID that is incremented by every inserted row ?
If so, you could use that as reference.[/quote]
This would only work for inserts and the OP says he needs to obtain updates too.

Using your example

Those dates do not specify the time zone, you should use the same TZ in your app as the database uses. You can already see a potential bug with the timespan that you give (I appreciate it was just an example) because there are 59 seconds missing from the end of the day 23:59:59 vs. 23:59:00.

Just imagine that right now the server thinks it is 23:59:56 when your app (code or timer) thinks it is 23:59:59, you issue your query thinking that you will get all records up to the end of the day but you are actually issuing the query a few seconds too early for the server and you can miss one of more inserts/updates.

Even with the clocks perfectly aligned your query could execute in the same second that an update is occurring, your query happens during the first half of that second and the insert/update occurs in the latter half of the second, you miss the record…

With my idea, this wouldn’t happen.

Initial query is for StartDate 2013-04-01T00:00:00 and EndDate 2013-04-03T00:00:00 (always two days, or something, from the current date). The result set, presumably, will have the modification date of each of the records. This is the server date, so you sort them descending and find that the last modified record was on 2013-03-31T23:58:45. That becomes the StartDate for the next query, and the local time never makes a difference. There will be some overlap, but the OP said that was not a big deal.

Nobody has addressed this so I’m starting to think I’m missing something obvious and looking the fool…

[quote=32364:@Kem Tekinay]With my idea, this wouldn’t happen.

Initial query is for StartDate 2013-04-01T00:00:00 and EndDate 2013-04-03T00:00:00

you sort them descending and find that the last modified record was on 2013-03-31T23:58:45. That becomes the StartDate for the next query, and the local time never makes a difference.

Nobody has addressed this so I’m starting to think I’m missing something obvious and looking the fool…[/quote]

2013-03-31T23:58:45 should not be returned by the query for StartDate 2013-04-01T00:00:00 and EndDate 2013-04-03T00:00:00

The potential problem is a concurrency one. While the query executes other records can be changing and the resolution of 1 second is not enough to guarantee the expected result. That is why the status field works - you have either set the status or have not. Solutions like this often appear to work especially when developed in single user scenarios or unrealistic test environments but then ‘strange’ bugs get reported. Overlapping the start and end dates/times by a reasonable amount can improve things.

That’s what I get for writing that at 3 AM. :slight_smile: Still, let’s say the latest mod date from the returned set is 2013-04-01T01:05:10. Since the OP can’t change the source database (a status field is out of the question), and you bring up a good point about concurrency, the next StartDate could be the latest date minus a minute, 2013-04-01T01:04:10.

:slight_smile: I figured it was a typo.

I did write what you suggest a few posts back, acknowledging your idea about extending the end time and also subtracting from the start time to get an overlap.

[quote=32363:@Carl Clarke]In the absence of being able to use a status field as per Rick’s suggestion you could add some overlap into the time period and check or reprocess the overlapped records at the start of the time period and as Kem suggests overlap the end time too. It is a bit of a kludge because you really want to be 100% sure and that is what the status field gives you.
[/quote]

And that’s what I get for trying to follow a thread at 3 AM. :slight_smile:

Kem:

No, that’s very good advice. Especially using a future date.

Not that I have access to through the XML API. And yes, I need to get modified records as well.

Yes, that would work and it wouldn’t be a problem. The worst that would happen is that a couple of records that already do exist will get updated once again. And I do mean just a couple. This is very low traffic.

There’s no place in the XML API to specify a time zone. All machines where time is critical are in the same time zone, the two database servers.

But I’m curious about daylight savings time and using the TotalSeconds of the Date object with a GMT offset to 0. Would this eliminate the problem with daylight savings time? Would it be one unchanging time all year long?

I did not catch that 59 second issue, thanks.