Tips for Benchmarking Xojo

A few tips & hints about getting maximum speed out of Xojo.

Imagine a very simple 2 level for/next loop:

dim total as integer
for x as integer = 1 to 1e4
  for y as integer = 1 to 1e4
    total = total + 1
  next
next

Essentially this is 100 million repeats of “total = total + 1”. It should be wicked fast on modern CPUs, right?

There are 2 factors that are important:

First, Xojo uses cooperative threading, which means that it checks to see if another thread needs time (this is called “yielding”). Xojo checks at loop boundaries (and some other cases).

This happens even if there are no other threads, and can (in some cases) waste a lot of CPU time.

Second, Xojo debug runs have extra code for the app to coordinate with the IDE / debugger. It turns out this can be very slow.

Here are Xojo benchmarks on an Intel i9 MacBook Pro.

Variables:

  • IDE vs. Built App
  • Cooperative threading: #pragma DisableBackgroundTasks used or not.

Results:
  Debug Run in IDE:
    Background Tasks Disabled: 100000000 iterations took 7.9 seconds
    Background Tasks Enabled: 100000000 iterations took 10.1 seconds
    Ratio = 1.28x


  Compiled app:
    Background Tasks Disabled: 100000000 iterations took 0.2 seconds
    Background Tasks Enabled: 100000000 iterations took 2.3 seconds
    Ratio = 12.51x

  Overall ratio (best vs. worst):
     50x faster when run in a built app with DisableBackgroundTasks set.

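Conclusions: in these runs a compiled app was roughly 4x faster than a debug run in the IDE with background tasks enabled (10.1 s vs. 2.3 s) and roughly 40x faster with them disabled (7.9 s vs. 0.2 s). Managing yielding properly gave another ~12x speedup in the compiled app, but only 1.28x in the debugger.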
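
For anyone who wants to reproduce numbers like these, a timing harness could look roughly like the sketch below. This is not the exact code used for the results above. It assumes Xojo 2019r2 or later, where System.Microseconds and System.DebugLog are available, and the variable names are only illustrative. Remove the #Pragma line to measure the "Background Tasks Enabled" case.

```xojo
// Sketch only: time the two-level loop from the first post.
#Pragma DisableBackgroundTasks  // remove this line to measure with yielding enabled

Dim total As Integer
Dim startTime As Double = System.Microseconds  // microseconds since startup

For x As Integer = 1 To 10000
  For y As Integer = 1 To 10000
    total = total + 1
  Next
Next

// Convert the elapsed microseconds to seconds
Dim elapsed As Double = (System.Microseconds - startTime) / 1000000.0
System.DebugLog(Str(total) + " iterations took " + Str(elapsed) + " seconds")
```

Run it once from the IDE and once as a built app to cover both variables in the comparison.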


For fun, I built the app with “Aggressive” optimizations turned on. Wow!

The new results:

Compiled app (aggressive optimization)
    Background Tasks Disabled: 100000000 iterations took 0.0000001320 seconds
    Background Tasks Enabled: 100000000 iterations took 2.24 seconds
    Ratio = 17043071x
Overall ratio (best vs. worst):
   76 million times faster

That’s hard to believe, so I suspect the aggressive optimizer was smart enough to remove the for/next loop entirely.
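
One way to test that suspicion is to make the loop's result depend on a value that is only known at run time, and then actually use the result afterwards, so the optimizer has less room to throw the work away. The following is an untested sketch under those assumptions: seed is a hypothetical variable introduced for illustration, System.Ticks is used only as a convenient run-time value, and there is no guarantee the optimizer cannot still simplify the loop.

```xojo
// Sketch only: a loop that is harder for the optimizer to remove.
#Pragma DisableBackgroundTasks

// seed is a hypothetical run-time value the compiler cannot predict
Dim seed As Integer = System.Ticks Mod 7
Dim total As Integer
Dim startTime As Double = System.Microseconds

For x As Integer = 1 To 10000
  For y As Integer = 1 To 10000
    // Bitwise Xor makes each step depend on both counters and the seed
    total = total + ((x Xor y) + seed)
  Next
Next

Dim elapsed As Double = (System.Microseconds - startTime) / 1000000.0
// Logging total makes the result observable, so it cannot simply be dropped
System.DebugLog("total = " + Str(total) + ", took " + Str(elapsed) + " seconds")
```

If this version takes a measurable fraction of a second under Aggressive optimization while the original takes nanoseconds, that supports the idea that the original loop was optimized away rather than executed.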

  • With DisableBackgroundTasks you get the beachball after a second. Your users will terminate the app because they think it has hung.
  • I tried the Moderate optimization level for my app at some point. No difference at all. It does mostly string manipulation.

If you wanted to take this a stage further, this would be ideal to test with workers.

  1. Worker test using standard Xojo functionality.
  2. Worker test using a block of shared memory to pass parameters to the workers and retrieve the results from the worker.

There is overhead in configuring and launching the “helpers”, but once you get the configuration just right you can expect the time to be reduced by a factor of roughly coreCount - 1.

In the above case it would also show the beachball when DisableBackgroundTasks is on.
So no difference.

My main application, which controls a production plant and is written in Xojo, noticeably benefits from aggressive optimisation. In my case I have quite a complicated user interface, built to show as much information as possible about various aspects of production at the same time. Redrawing all the controls while resizing the window or moving custom interface splitters works faster. This is especially noticeable on Intel builds, less so on M1 ones. The same goes for the fairly deeply nested calculations performed during production planning: about twice as fast compared to the normal setting.

Moving heavy calculations to threads helps avoid the beachball altogether. The status of the calculation is displayed via global variables that are set in a thread (the calculation) and read by a timer (which displays the progress). I hate the beachball since it gives the impression of badly written software :wink:

When using DisableBackgroundTasks, it’s a good idea to (occasionally) yield. I often add something like this:

for x as integer = 1 to 10000000
  if x mod 10000 = 1 then  // adjust the 10000 as needed so you are yielding enough but not too much
    app.yieldCurrentThread
    UpdateTheProgressBar()
  end if
  [... do the computations ...]
next

Hi there,

I tend to use:

Thread.YieldToNext

rather than just Yield, because if no other thread is waiting for processor time, your thread simply continues without needing to pause.

From the Doco:

This causes the Thread Scheduler to yield the currently executing thread’s processing time to the next Thread in the queue that is awaiting processing cycles. It is possible for the next Thread to be the currently executing Thread.

Regards

TC

and does that avoid the beachball?

Yes, it certainly does. In my assembly engine I might be running a long calculation, and I’ll have an occasional YieldToNext somewhere (or YLD in assembly speak). It doesn’t need to be called all the time, just frequently enough to keep the GUI thread updated.
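
Putting the two suggestions together, an occasional YieldToNext inside a loop that otherwise has background tasks disabled might look like the sketch below. It is an illustration only: it assumes the code runs inside a Thread, uses Thread.YieldToNext as named in the post above, and the interval of 10000 and the placeholder computation are arbitrary choices that would need tuning for a real workload.

```xojo
// Sketch only: disable automatic yielding, then yield manually now and then.
#Pragma DisableBackgroundTasks

Dim total As Integer
For x As Integer = 1 To 10000000
  // Yield once every 10000 iterations; tune this interval for your workload
  If x Mod 10000 = 0 Then
    Thread.YieldToNext
  End If
  total = total + 1  // stand-in for the real computation
Next
```

The trade-off is the same as described earlier in the thread: yield too rarely and the GUI stalls, yield too often and the loop loses most of the benefit of disabling background tasks.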