Function calls very expensive

Thomas_Moore · July 17, 2023, 9:16pm

I have an application that is taking too long to execute, so I am trying to figure out what is eating up all the time. I tested some simple operations, and found that calling a function to multiply a number by itself and return the result takes (averaged over 20 million operations) more than 500 times as long as simply executing a = a*a.

Specifically, I compared executing 20 such operations in a loop that cycled a million times with an “empty” loop that simply executed “a = 1.2345” 20 times. Even 20 repetitions of a = a*a is negligibly different than the empty loop when executed a million times, but the function call requires much more time.

My question is, why? A factor of 2 or 3 or even 10 I might understand (you have to put the parameter in the stack, jump to the address of the method, pop the parameter, do the operation, put the result on the stack, jump back, pop the result) but 500 times? Can anyone explain why?

Greg_O · July 17, 2023, 9:18pm

Because LLVM is very good at optimizing math operations and not good at optimizing method calls.

I would ask though, have you tried the “aggressive” optimization setting?

Thomas_Moore · July 17, 2023, 9:34pm

Yes. The results I gave were after “aggressive” optimization.

Thomas_Moore · July 17, 2023, 9:41pm

(Indeed, I thought that the “aggressive” optimization might strip out the “empty” loop altogether, since it does nothing, but it kept the loop.)

Björn_Eiríksson · July 17, 2023, 9:42pm

You can speed it up a lot by setting some well-chosen pragmas in your functions.

It's mostly the Xojo layer, or Xojo's translation into LLVM, that is slow with function calls, not LLVM itself (otherwise C++ and Objective-C would suffer from it too).

Thomas_Moore · July 17, 2023, 9:48pm

Thanks for the response, Björn. But what pragmas would be relevant? Maybe “StackOverflowChecking?” Would background tasks be allowed during a function call but not in direct math?

Björn_Eiríksson · July 17, 2023, 9:55pm

StackOverflowChecking
DisableBackgroundTasks
Bounds checking…

probably some more, been a while since I looked at those.

If you're not sure, the best approach is to put them in and re-run your test.
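
As a rough sketch of what that could look like: the function name Square is only a placeholder, and NilObjectChecking is an extra pragma that was not named above. As far as I know these pragmas only affect the method they are placed in, so they would go at the top of each hot function. Which ones actually help is something to measure rather than assume.

```xojo
Public Function Square(x As Double) As Double
  // Assumption: placed at the top so they cover the whole method body.
  #Pragma DisableBackgroundTasks
  #Pragma StackOverflowChecking False
  #Pragma BoundsChecking False
  #Pragma NilObjectChecking False

  Return x * x
End Function
```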

Tim_Hare · July 17, 2023, 10:07pm

You also have to resolve which function to call. This happens at runtime, because it may call a different function based on context. That is a non-trivial operation.

Thomas_Moore · July 17, 2023, 10:20pm

Ha! I tried disabling background tasks and aggressive compiling and it did set my “empty” loop to zero time. So I need to compare against some other kind of simple operation that the compiler does not consider trivial!
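
One way to keep the optimizer from discarding the loop is to make its result observable, for example by accumulating into a variable and logging it afterwards. The following is only a sketch of that idea: Square, CompareTimings and kLoops are made-up names, it assumes API 2.0 (Var, ToString), and the compiler may still simplify parts of the inline loop.

```xojo
Public Function Square(x As Double) As Double
  Return x * x
End Function

Public Sub CompareTimings()
  Const kLoops = 1000000
  Var a As Double = 1.2345

  // Inline multiply, accumulated so the result is actually used.
  Var totalInline As Double = 0
  Var startInline As Double = System.Microseconds
  For i As Integer = 1 To kLoops
    totalInline = totalInline + a * a
  Next
  Var inlineTime As Double = System.Microseconds - startInline

  // Same arithmetic, but through a function call.
  Var totalCall As Double = 0
  Var startCall As Double = System.Microseconds
  For i As Integer = 1 To kLoops
    totalCall = totalCall + Square(a)
  Next
  Var callTime As Double = System.Microseconds - startCall

  // Logging the totals means neither loop is dead code.
  System.DebugLog("inline: " + inlineTime.ToString + " microseconds, total = " + totalInline.ToString)
  System.DebugLog("call: " + callTime.ToString + " microseconds, total = " + totalCall.ToString)
End Sub
```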

Andrew_Lambert · July 17, 2023, 10:23pm

Note that this applies to regular class methods. Module methods and shared class methods are static.
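
For illustration, a shared method on a hypothetical MathHelper class would look roughly like this; it is called on the class itself, so no instance is involved in the call. Whether this makes a measurable difference in a given app would need to be timed.

```xojo
// Inside a class named MathHelper (hypothetical):
Public Shared Function Square(x As Double) As Double
  Return x * x
End Function

// At the call site, no instance is needed:
Var result As Double = MathHelper.Square(1.2345)
```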

Sam_Rowlands · July 18, 2023, 12:35am

Two things.

1. Xojo’s function calls are expensive. It can take some serious refactoring for mission-critical work, but if speed is important to you, it might be worth it. It may also be worth looking into writing a plugin or helper with a different tool; that way you not only get the speed benefit that other tools/languages can bring, but you can also easily use concurrency. It is shocking how easy it is to use concurrency on a Mac without Xojo.
2. Compiled applications are a lot faster than when run in the IDE (regardless of compiler directives).

Ivan_Tellez · July 18, 2023, 2:31am

That is how the xojo framework is designed?

Have you heard something like:
“Don’t use lengthy functions. Ideally, a single function should carry out a single task.”

Well, you have to forget about coding best practices when working with Xojo.

It is ugly, yes, but avoiding function calls is a must in Xojo. For example, I had an app with about 50 custom controls in a window, with 2 functions called when each control was created. That made the window take 4 whole extra seconds to load compared with lengthy spaghetti code all crammed into the Open event.

For a plugin… yes, but only if you don't need to call the plugin functions a lot.

Beatrix_Willius · July 18, 2023, 3:53am

I do a lot of text analysis. I found that copying strings as function parameters is costly. Now I use ByRef for the parameters with longer text.
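
A minimal sketch of that change, with a hypothetical function name and assuming API 2.0 string methods. Note that a ByRef parameter has to be given a variable at the call site, not an expression or literal.

```xojo
// ByRef passes the caller's variable itself instead of a new reference to the value.
Public Function CountLines(ByRef source As String) As Integer
  Return source.CountFields(EndOfLine)
End Function

// Usage: the argument must be a variable.
Var bigText As String = "first line" + EndOfLine + "second line"
Var lineCount As Integer = CountLines(bigText)
```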

Any speed analysis starts with profiling - either with the Xojo profiler or with Instruments. Doing artificial loops isn’t very useful.

Arnaud_N · July 18, 2023, 9:03am

In one of my apps, I’m trying to track down an issue, but profiling doesn’t reveal anything.
Among other things, the app shows a countdown timer for an hour, in a window. It happens that the countdown occasionally misses seconds (23:45 then 23:43), although the timer’s code is straightforward.
I’ve tried tracking down the issue, using Xojo’s profile method, or writing my own profile method, both to no avail. All called functions, average time or total time, looked to take tiny amounts of time. I now assume the issue to be in uncontrollable areas, like the event loop, or anywhere between what can be profiled/seen from code (since nothing in code seems the culprit).
Looks odd, probably, but I had to accept that.

Björn_Eiríksson · July 18, 2023, 9:40am

Xojo strings are always passed by reference, though. All they do is REALLockString and REALUnlockString (which increment and decrement the reference count). (You can 100% verify that behavior in the plugin SDK.)

There is no copying of the content of the string.

Beatrix_Willius · July 18, 2023, 10:01am

I was able to make about 10% speed gains by using by ref.

Christian_Schmitz · July 18, 2023, 10:03am

With byref you skip the locking probably.
Which costs time.

Björn_Eiríksson · July 18, 2023, 10:25am

There is no locking, though, unless it's a return parameter.

Christian_Schmitz · July 18, 2023, 11:50am

Well, whether the lock is made in calling code or in called code doesn’t matter.
But we see it all the time in assembly.

Public Sub test2(n as string)
  test3 n
End Sub

Such a method calls RuntimeLockString and RuntimeUnlockString to keep the string referenced. If it is called in a class, it will also do RuntimeLockObject and later RuntimeUnlockObject on the object called on.

Sam_Rowlands · July 19, 2023, 1:31am

App Nap reduces how often your application receives CPU time. You’ll need declares to inform macOS that your application needs those CPU cycles, as this functionality is not supported by Xojo.