MBSParseDate speed

I noticed the latest MBS release has a new ParseDate function. Has anyone done a speed test with it yet?

One of the critical performance paths in my app converts hundreds of thousands of date strings into date objects and I’m intrigued by the possibility that MBSParseDate might be helpful in speeding it up. Obviously I need to do my own benchmark, but I was curious if anyone has any thoughts on expected relative speed compared with this Xojo code:

//Syslog date format: 2015-02-07T07:47:04.872175+00:00
DateStr = ReplaceB (DateStr, "T", " ")
dim d as new date
d.GMTOffset = 0.0
d.SQLDateTime = DateStr
Timestamp = d

One improvement might come from being able to treat that “T” as a field separator, so there is no need to create a whole new string with a space in it just so Xojo’s Date.SQLDateTime can handle it.
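
Since a benchmark is needed either way, a minimal timing loop for the SQLDateTime path might look like the sketch below. The sample string and the loop count are assumptions for illustration; the string is already in the "YYYY-MM-DD HH:MM:SS" form so that only the parsing and the object creation are measured.

```xojo
// Rough timing sketch using the classic framework.
// The sample value and the count of 100000 are assumptions, not real data.
Dim sample As String = "2015-02-07 07:47:04"
Dim startTime As Double = Microseconds

For i As Integer = 1 To 100000
  Dim d As New Date
  d.GMTOffset = 0.0
  d.SQLDateTime = sample
Next

// Microseconds returns a Double, so divide to get milliseconds.
Dim elapsedMs As Double = (Microseconds - startTime) / 1000.0
System.DebugLog("SQLDateTime: " + Str(elapsedMs) + " ms for 100000 parses")
```

Running the same loop with the plugin call in place of the three Date lines would give a like-for-like comparison.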

Well, please try it.

Christian

Thanks for the very insightful info…

Cheers,
Joe

Well, a quick test here shows the plugin is a factor of 10 slower.
That seems to be due to all the SDK calls needed to set the date object.

For the next part of my review of frameworks I will have to ask: which of the official frameworks and the several plugin vendors should you use for any given task?

At what point does Xojo say “Christian how did you do that?”

Some things can be done in Xojo itself.
Some things need declares.
Some things need plugins.

And for some things, you can only use my plugins :slight_smile:

We know how to do all the same things
Wrapping it into a cross platform API is a lot harder though

For instance, HiDPI support isn’t JUST an OS X thing, and how it’s handled on Windows, Linux & OS X is quite different and requires very different support on each

Exposing the APIs is the easy part

Don’t take my comment as trivializing the work involved. Just saying to an outsider it will appear that some plugins contain functionality or improvements that should rightfully already be in Xojo.

Christian has managed to package his ParseDateMBS plugin for all three platforms and claims it’s 10x faster. That’s awesome. However, not everyone will know that or have access to the plugin, so they suffer when it’s now apparent it could be improved for everyone. It’s cool for one or two things, but the reliance on plugins is a challenge for newcomers.

That would be interesting since the Xojo code doesn’t work AS IS - at least not in 2016r4.1

		//Syslog date format: 2015-02-07T07:47:04.872175+00:00 
		dim DateStr as string = "2015-02-07T07:47:04.872175+00:00"
		DateStr = ReplaceB (DateStr, "T", " ")
		dim d as new date
		d.GMTOffset = 0.0
		d.SQLDateTime = DateStr
		break

raises an unsupported format exception on the assignment to d.sqldatetime

Including the TZ offset as +00:00 is the problem
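
A minimal sketch of one way around that, assuming every input has the fixed syslog layout from the comment and always carries a +00:00 offset: keep only the first 19 characters before handing the string to SQLDateTime. This discards the fractional seconds, so it only fits if sub-second precision is not needed.

```xojo
// Sketch: assumes the fixed layout YYYY-MM-DDTHH:MM:SS.ffffff+00:00.
Dim raw As String = "2015-02-07T07:47:04.872175+00:00"

// Keep "YYYY-MM-DDTHH:MM:SS" (19 characters); this drops the fraction and the offset.
Dim DateStr As String = Left(raw, 19)
DateStr = ReplaceB(DateStr, "T", " ")

Dim d As New Date
d.GMTOffset = 0.0
d.SQLDateTime = DateStr
```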

A generic “convert ISO 8601 date format to Xojo date format” is a real pain since ISO allows all kinds of odd formats (like just a year)

I think you need to reread that and curb your enthusiasm :stuck_out_tongue:
He said it’s 10x SLOWER

[quote=321029:@Christian Schmitz]Well, a quick test here shows the plugin is a factor of 10 slower.
That seems to be due to all the SDK calls needed to set the date object.[/quote]

- emphasis added -

[quote=321035:@Norman Palardy]I think you need to reread that
He said it’s 10x SLOWER[/quote]

Wow! I did misread that badly.

Now I am confused, from the MBS documentation, about why I would use this method.

I think what I have is plugin fatigue, because MBS, Einhugur, and Chilkat all provide like 50% of the same stuff. It’s not always easy or desirable to mix and match on one project, so I end up making trade-offs; otherwise I end up with a huge app size.

Well, I made it to support parsing with a specified format and localization.

It does a completely different job than normal ParseDate.
(and a client needed that)

[quote=321037:@Christian Schmitz]Well, I made it to support parsing with a specified format and localization.

It does a completely different job than normal ParseDate.
(and a client needed that)[/quote]

Yeah, I see the several formatting options - I get it now. I’ve been able to write parsers for dates when I encountered an unusual format, but in other environments the built-in parsers were quite powerful.

Norman… FYI The CODE is a copy/paste from my project that does run in the current Xojo, but yes there are a couple of preceding operations that massage the raw input string that’s shown in the COMMENT.

The source date strings are in GMT, and I have a popup menu for selecting the desired display timezone. I need a fast way to parse the source dates, apply a different GMT offset and get a date/time string back for display.

Currently I parse the source strings once and keep references to about a million individual date objects. Applying a user selected GMT offset and getting a new display string is fast enough for the subset of dates that get displayed at any given time. But parsing the original date strings is time consuming. Not sure if it’s related to creating a million date objects or the actual parsing of a million date strings.

I’ve considered creating a wrapper for the date class with ONE static date object that does the parsing and stores TotalSeconds in the class instances. This would still create a million new class instances but would avoid creating a million Xojo Date instances and maybe even avoid a million of some sort of OS date object too.
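
A rough sketch of that idea, written as a plain method rather than a full wrapper class. The name ParseToTotalSeconds is hypothetical, and the input is assumed to be already in "YYYY-MM-DD HH:MM:SS" form. One Date object is created on the first call and reused for every parse; only the Double is kept.

```xojo
// Hypothetical helper: parse with ONE shared Date object, keep only TotalSeconds.
Function ParseToTotalSeconds(sqlDateTime As String) As Double
  // Static keeps the same Date instance alive across calls.
  Static d As Date
  If d Is Nil Then d = New Date
  
  d.GMTOffset = 0.0
  d.SQLDateTime = sqlDateTime
  Return d.TotalSeconds
End Function
```

Whether this is actually faster than creating a million Date objects would still need measuring, and how TotalSeconds interacts with GMTOffset when building the display string should be checked before relying on it.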

I have code that handles conversion of a format that is VERY close to this one
I love ISO 8601 format as there are so many “standards formats” it allows

Public Function FromISO8601Date(t as text) as xojo.Core.Date
  // these should be formed as ISO 8601 format
  // <date>T<time><TZ>
  // where
  //  <dates> are stated as YYYY-MM-DD
  //  <times> are stated as HH:MM:SS.NNNN
  //          HH is hours (24 hour format, 00-23)
  //          MM is minutes
  //          SS is seconds
  //          NNNN is the fraction of a second (may be more than 4 digits long)
  //  <TZ> is formed as
  //         Z if the TZ is GMT (offset = 0)
  //         -hhmm if the tz is GMT- (offset < 0)
  //         +hhmm if the tz is GMT+ (offset > 0) 
  //             hh is hours before GMT
  //             mm is minutes before GMT
  //       so we can handle offsets that are not whole or half hours
  
  //               1         2         3
  //     01234567890123456789012345678901234567890
  // ie/ YYYY-MM-DDTHH:MM:SS.NNNNZ
  //               1         2         3
  //     01234567890123456789012345678901234567890
  //     YYYY-MM-DDTHH:MM:SS.NNNN+hhmm
  //               1         2         3
  //     01234567890123456789012345678901234567890
  //     YYYY-MM-DDTHH:MM:SS.NNNN-hhmm
  
  
  dim year, month, day as integer
  dim hour, minutes, seconds, nanos as integer
  dim gmthrs, gmtmins as integer
  dim tmp as text
  
  // everything except a few spots has to be a number
  
  dim digits as text = "0123456789"
  for i as integer = 0 to t.length()-1
    if i = 4 then continue
    if i = 7 then continue
    if i = 10 then continue
    if i = 13 then continue
    if i = 16 then continue
    if i = 19 then continue
    if i >= 20 and t.mid(i,1) = "." then continue
    if i >= 20 and t.mid(i,1) = "Z" then continue
    if i >= 20 and t.mid(i,1) = "+" then continue
    if i >= 20 and t.mid(i,1) = "-" then continue
    
    if digits.IndexOf(t.mid(i,1)) < 0 then raise new UnsupportedFormatException
  next
  
  // check for -'s in the right spots
  tmp = t.mid(4,1)
  if tmp <> "-" then raise new UnsupportedFormatException
  tmp = t.mid(7,1)
  if tmp <> "-" then raise new UnsupportedFormatException
  // check for T in the right spots
  tmp = t.mid(10,1)
  if tmp <> "T" then raise new UnsupportedFormatException
  // check for :'s in the right spots
  tmp = t.mid(13,1)
  if tmp <> ":" then raise new UnsupportedFormatException
  tmp = t.mid(16,1)
  if tmp <> ":" then raise new UnsupportedFormatException
  // check for .'s in the right spots
  tmp = t.mid(19,1)
  if tmp <> "." then raise new UnsupportedFormatException
  
  // convert data
  tmp = t.Mid(0,4)
  if tmp.Length <> 4 then raise new UnsupportedFormatException
  year = Integer.FromText(tmp)
  
  tmp = t.Mid(5,2)
  if tmp.Length <> 2 then raise new UnsupportedFormatException
  month = Integer.FromText(tmp)
  
  tmp = t.Mid(8,2)
  if tmp.Length <> 2 then raise new UnsupportedFormatException
  day = Integer.FromText(tmp)
  
  tmp = t.Mid(11,2)
  if tmp.Length <> 2 then raise new UnsupportedFormatException
  hour = Integer.FromText(tmp)
  
  tmp = t.Mid(14,2)
  if tmp.Length <> 2 then raise new UnsupportedFormatException
  minutes = Integer.FromText(tmp)
  
  tmp = t.Mid(17,2)
  if tmp.Length <> 2 then raise new UnsupportedFormatException
  seconds = Integer.FromText(tmp)
  
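  // ok from here we read either to a +/-/Z
  dim i as integer = 20
  tmp = ""
  while i < t.Length
    if t.mid(i,1) = "+" then exit
    if t.mid(i,1) = "-" then exit
    if t.mid(i,1) = "Z" then exit
    tmp = tmp + t.mid(i,1)
    i = i + 1
  wend
  // ran off the end without finding +/-/Z, or there were no fraction digits
  if i >= t.Length or tmp.Length = 0 then raise new UnsupportedFormatException
  
  // the digits are a fraction of a second, so scale them to exactly
  // 9 digits before treating them as nanoseconds (.872175 -> 872175000)
  if tmp.Length > 9 then tmp = tmp.Left(9)
  while tmp.Length < 9
    tmp = tmp + "0"
  wend
  nanos = Integer.FromText(tmp)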
  
  // i is ON the separator whatever it was (Z + or -)
  dim mult as integer = 1
  if t.mid(i,1) = "Z" then 
    i = i + 1
    
  else
    if t.mid(i,1) = "-" then 
      mult = -1
    end if
    i = i + 1
    
    if t.length() - i <> 4 then raise new UnsupportedFormatException
    tmp = t.mid(i,2)
    if tmp.Length <> 2 then raise new UnsupportedFormatException
    gmthrs = Integer.FromText(tmp)
    i = i + 2
    
    tmp = t.mid(i,2)
    if tmp.Length <> 2 then raise new UnsupportedFormatException
    gmtmins = Integer.FromText(tmp)
    i = i + 2
    
  end if
  
  if t.length <> i then raise new UnsupportedFormatException
  
  // the sign has to apply to the minutes as well as the hours
  dim gmtoffset as integer = mult * ((gmthrs * 3600) + (gmtmins * 60))
  dim tz as new xojo.core.TimeZone(gmtoffset)
  
  //(year, month, day, hour, minute, seconds, nanoseconds, timezone As TimeZone)
  return new xojo.core.date(year, month, day, hour, minutes, seconds, nanos, tz)
  
End Function
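
A short usage sketch for the function above, assuming it lives in a module that is in scope. The function expects the offset as +hhmm with no colon, so the syslog value from earlier in the thread (which ends in +00:00) has its colon dropped first. The variable names are illustrative only.

```xojo
// Usage sketch: adapt the syslog string, then call FromISO8601Date.
Dim raw As Text = "2015-02-07T07:47:04.872175+00:00"

// Turn the trailing "+00:00" into "+0000" by removing the colon.
Dim t As Text = raw.Left(raw.Length - 3) + raw.Right(2)

Try
  Dim d As Xojo.Core.Date = FromISO8601Date(t)
  System.DebugLog(d.ToText)
Catch e As UnsupportedFormatException
  System.DebugLog("Unsupported date format")
End Try
```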

As long as the +00:00 portion is removed then yes it should be pretty quick
That was the goal with the code I posted - create the object ONCE rather than create it and then mutate the heck out of it like I had at the outset
Made it way faster

Norman, thanks for the code. Since the date object can parse SQLDateTime itself, is the main benefit that you support a different format, or does your code have a speed advantage over SQLDateTime too?

this code returns xojo.core.date
supports a “fully qualified” 8601 date (which xojo.core.date doesn’t even with the right locale that I’m aware of)
supports using TZ offsets (which sqldate doesn’t normally)

so I think there are some advantages

it’s the counterpart to a “ToISO8601Date” method I also wrote which converts a xojo.core.date into a fully qualified 8601 date

Norman Yes, I do see the format flexibility benefits, I was just asking if you know if there’s also a speed benefit… sorry I wasn’t clear…

Put another way, does Date.SQLDateTime have a lot of extra overhead, potentially due to things like complex locale or encoding issues or OS user preference settings etc? Your code is very clear and clever, and seems like it might be very fast. But at the end of the day you still stuff everything into a Xojo date object and I was wondering if that results in any speed hit…

Well, since I don’t ask it to parse any strings, it avoids all the code that would parse the SQL date
So that should make this reasonably quick
BUT I do do a lot of text to integer conversions so there’s a hit there

SQL date parsing IS done in C (and may be right down in an OS level API - I’ve not looked so I don’t know for sure)
So it should be about as quick as it can be at turning text into integers and then into a date object

I’ve not benchmarked it so I really don’t know if this is faster / slower / the same
Just needed the code so there it is :slight_smile: