Function calls very expensive

I have an application that is taking too long to execute, so I am trying to figure out what is eating up all the time. I tested some simple operations and found that calling a function to multiply a number by itself and return the result takes (on average, over 20 million operations) more than 500 times as long as simply executing a = a*a.

Specifically, I compared executing 20 such operations in a loop that cycled a million times with an “empty” loop that simply executed “a = 1.2345” 20 times. Even 20 repetitions of a = a*a is negligibly different than the empty loop when executed a million times, but the function call requires much more time.
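For anyone who wants to reproduce a comparison like this, here is a minimal sketch of such a timing test. It is not the original poster's code: the names Square, RunBenchmark and kPasses are made up for illustration, and it does one multiplication per pass rather than 20. The result is logged at the end so the compiler has a reason to keep the loops.

```xojo
' Hypothetical helper (e.g. in a module); the name Square is an assumption.
Function Square(x As Double) As Double
  Return x * x
End Function

' Hypothetical test harness, e.g. called from a button event.
Sub RunBenchmark()
  Const kPasses = 1000000
  Dim a As Double
  Dim startTime As Double

  ' Inline multiplication
  startTime = System.Microseconds
  For i As Integer = 1 To kPasses
    a = 1.2345
    a = a * a
  Next
  Dim inlineMicros As Double = System.Microseconds - startTime

  ' The same work routed through a function call
  startTime = System.Microseconds
  For i As Integer = 1 To kPasses
    a = 1.2345
    a = Square(a)
  Next
  Dim callMicros As Double = System.Microseconds - startTime

  ' Log the timings and the final value of a so the work is actually used
  System.DebugLog("inline: " + Str(inlineMicros) + " us, call: " + Str(callMicros) + " us, a = " + Str(a))
End Sub
```

Run it in a built application rather than in the debugger, since timings differ considerably between the two.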

My question is, why? A factor of 2 or 3 or even 10 I might understand (you have to put the parameter in the stack, jump to the address of the method, pop the parameter, do the operation, put the result on the stack, jump back, pop the result) but 500 times? Can anyone explain why?

Because LLVM is very good at optimizing math operations and not good at optimizing method calls.

I would ask though, have you tried the “aggressive” optimization setting?

Yes. The results I gave were after “aggressive” optimization.

(Indeed, I thought that the “aggressive” optimization might strip out the “empty” loop altogether, since it does nothing, but it kept the loop.)

You can speed it a lot by setting some well chosen pragmas in your functions.

It's mostly the Xojo layer, or Xojo's translation into LLVM, that is slow with function calls, not LLVM itself (otherwise C++ and Objective-C would suffer from it too).

Thanks for the response, Björn. But what pragmas would be relevant? Maybe “StackOverflowChecking?” Would background tasks be allowed during a function call but not in direct math?

StackOverflowChecking
DisableBackgroundTasks
Bounds checking…

Probably some more; it's been a while since I looked at those.

If you're not sure, the best approach is to put them in and re-run your test.
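As a rough sketch of what that looks like in practice, the pragmas go at the top of the method body and apply to that method only. The function name Square is just an example; whether each pragma helps in a given case is something to measure, not assume.

```xojo
' Hypothetical example function; the name Square is an assumption.
Function Square(x As Double) As Double
  ' Do not yield to background tasks inside this method
  #Pragma DisableBackgroundTasks
  ' Skip the stack overflow check on entry
  #Pragma StackOverflowChecking False
  ' Skip array bounds checks (no effect here, shown for completeness)
  #Pragma BoundsChecking False

  Return x * x
End Function
```

Turning these checks off also removes the safety net they provide, so it is probably best limited to small, well-tested hot paths.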

You also have to resolve which function to call. This happens at runtime, because it may call a different function based on context. That is a non-trivial operation.

Ha! I tried disabling background tasks and aggressive compiling and it did set my “empty” loop to zero time. So I need to compare against some other kind of simple operation that the compiler does not consider trivial!

Note that this applies to regular class methods. Module methods and shared class methods are static.
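To illustrate the distinction, here is a small sketch with a hypothetical class Calculator. The Class ... End Class wrapper is only notation for this post; in the IDE you would add the methods through the navigator and mark one of them as shared.

```xojo
' Hypothetical class; all names here are assumptions.
Class Calculator
  ' Regular (instance) method: called on an object
  Function Square(x As Double) As Double
    Return x * x
  End Function

  ' Shared method: belongs to the class itself, no instance involved
  Shared Function SquareShared(x As Double) As Double
    Return x * x
  End Function
End Class

' Calling code
Dim calc As New Calculator
Dim r1 As Double = calc.Square(3.0)               ' instance call
Dim r2 As Double = Calculator.SquareShared(3.0)   ' shared call
```

A method in a module is called the same way as the shared one, without any instance.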

Two things.

  1. Xojo’s function calls are expensive. Avoiding them can take some serious refactoring for mission-critical work, but if speed is important to you, it might be worth it. It may also be worth looking into writing a plugin or helper with a different tool. That way you not only get the speed benefit that other tools and languages can bring, but you can also easily use concurrency. It is shocking how easy it is to use concurrency on a Mac without Xojo.

  2. Compiled applications are a lot faster than when run in the IDE (regardless of compiler directives).

That is how the Xojo framework is designed?

Have you heard something like:
“Don’t use lengthy functions. Ideally, a single function should carry out a single task.”

Well, you have to forget about coding best practices when working with Xojo.

It is ugly, yes, but avoiding function calls is a must in Xojo. For example, I had an app with about 50 custom controls in a window, with 2 functions called when each control was created. That made the window take 4 whole extra seconds to load compared with lengthy spaghetti code all crammed into the Open event.

As for a plugin: yes, but only if you don't need to call the plugin functions a lot.

I do a lot of text analysis. I found that copying strings as function parameters is costly. Now I use ByRef for the parameters with longer text.

Any speed analysis starts with profiling - either with the Xojo profiler or with Instruments. Doing artificial loops isn’t very useful.

In one of my apps, I’m trying to track down an issue, but profiling doesn’t reveal anything.
Among other things, the app shows a countdown timer for an hour, in a window. It happens that the countdown occasionally misses seconds (23:45 then 23:43), although the timer’s code is straightforward.
I’ve tried tracking down the issue, using Xojo’s profile method, or writing my own profile method, both to no avail. All called functions, average time or total time, looked to take tiny amounts of time. I now assume the issue to be in uncontrollable areas, like the event loop, or anywhere between what can be profiled/seen from code (since nothing in code seems the culprit).
Looks odd, probably, but I had to accept that.

Xojo strings are always passed by reference, though. All that happens is REALLockString and REALUnlockString, which increment and decrement the reference count. (You can verify that behavior 100% in the plugin SDK.)

There is no copying of the content of the string.

I was able to make about 10% speed gains by using ByRef.

With ByRef you probably skip the locking, which costs time.

There is no locking, though, unless it's a return parameter.
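For reference, the two parameter styles being compared look roughly like this. The function names are made up, and String.Length assumes API 2.0; this only shows the declarations and call sites, not a measurement.

```xojo
' Hypothetical helpers; names are assumptions.

' Default (ByVal) string parameter
Function LengthByVal(s As String) As Integer
  Return s.Length
End Function

' ByRef string parameter: the callee works on the caller's variable
Function LengthByRef(ByRef s As String) As Integer
  Return s.Length
End Function

' Calling code
Dim source As String = "some long text"
Dim n1 As Integer = LengthByVal(source)
Dim n2 As Integer = LengthByRef(source)   ' argument must be a variable, not an expression
```

Note that with ByRef the called function can also change the caller's variable, so it is worth checking that nothing assigns to the parameter by accident.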

Well, whether the lock is made in calling code or in called code doesn’t matter.
But we see it all the time in assembly.

Public Sub test2(n As String)
  ' n is simply passed on to another method
  test3 n
End Sub

Such a method calls RuntimeLockString and RuntimeUnlockString to keep the string referenced. If it is a method of a class, it will also call RuntimeLockObject and later RuntimeUnlockObject on the object the method is called on.

App Nap reduces how often your application receives CPU time. You’ll need declares to tell macOS that your application needs those CPU cycles, as this functionality is not supported by Xojo.