Array() function using Pairs causes slow Aggressive Compile

Thanks to @Kem_Tekinay for reporting this initially.

See https://tracker.xojo.com/xojoinc/xojo/-/issues/77604

I managed to figure out the problem.

Using a large Array() of Pairs() will trigger the issue.

For example, this code:

var stations() as Pair = Array ( _
  "Abha" : 18.0,_
  "Abidjan" : 26.0,_
  [ ... 400 more lines  ... ]
  "Zanzibar City" : 26.0,_
  "Zürich" : 9.3 _
)

Will add 30+ minutes to an Aggressive compile.

Workaround: simply store the data as an array of strings, then convert each item into a pair.
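
A minimal sketch of that workaround, assuming each entry is encoded as "name;temperature" (the separator and the variable name raw are my own choices for illustration, not from the original project):

```xojo
// Assumption: every element is "name;temperature", separated by a semicolon.
Var raw() As String = Array( _
  "Abha;18.0", _
  "Abidjan;26.0", _
  "Zürich;9.3" _
)

// Convert each string into a Pair at runtime.
// The String/Double-to-Variant conversion now appears once in the code,
// inside the loop, instead of once per station.
Var stations() As Pair
For Each item As String In raw
  Var name As String = item.NthField(";", 1)
  Var temp As Double = item.NthField(";", 2).ToDouble
  stations.Add(New Pair(name, temp))
Next
```

The trade-off is a small amount of parsing at runtime in exchange for far fewer Variant conversions for the optimizer to analyze.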

To compose a dynamic (not static) array like Integer() with 3 numbers, the compiler must do something like:

array = CreateAnIntArray()
int = CreateInt(number)
array.Add(int)
int = CreateInt(number)
array.Add(int)
int = CreateInt(number)
array.Add(int)

To compose a dynamic array like Pair() with 3 pairs (string, double), the compiler must do something like:

array = CreateAPairArray()
varStr = CreateVariant(string)
varDouble = CreateVariant(double)
pair = CreatePair(varStr, varDouble)
array.Add(pair)
varStr = CreateVariant(string)
varDouble = CreateVariant(double)
pair = CreatePair(varStr, varDouble)
array.Add(pair)
varStr = CreateVariant(string)
varDouble = CreateVariant(double)
pair = CreatePair(varStr, varDouble)
array.Add(pair)

All of this is shown at a “macro” level purely for illustration; at the machine level it expands into many more instructions and calls.
The optimizers (for size, for speed, at light to heavy levels) analyze this low-level code, trying to figure out whether they can replace some instructions with others without changing what the original code does. Examples include keeping intermediate values that are used more than once in CPU registers (for speed, at the cost of a few bytes), or noticing a repeated code segment, moving it into its own block with a return at the end, and calling that block wherever the segment was used (for size, at the cost of a little speed), and much more.
You picked a block that is very hard for an analyzer to “study” and improve, and you made it big. Variants add insult to injury :rofl:
I’m very inclined to think that the optimization happens on the LLVM side and can’t be sped up by Xojo. It is certainly “not a bug”, since the compile does finish, and speeding it up may be an unfeasible request.

How about using

stations.Add(New Pair("Abha", 18.0))
stations.Add(New Pair("Abidjan", 26.0))

or an Extends method for it?

Is there a difference for Aggressive compile?

@MarkusR I tried a similar syntax:

var stations() as Pair
stations.append "Abha" : 18.0
stations.append "Abidjan" : 26.0
stations.append "Abéché" : 29.4
[ ... ]

and it has the same poor result.

@Rick_Araujo I see the point that this may not be a Xojo bug per se, but rather an issue in the LLVM optimizer. In that case, though, perhaps Xojo needs a better way to initialize dictionaries?

Unfortunately, Xojo’s recommended way to populate a Dictionary is this Array-of-Pairs syntax (see here and here), which causes the slowdown.

I agree, but I’m not sure whether they would do a better job behind the scenes. Still, a native JSON-like syntax would be superb, such as:

Var dict As Dictionary = {
     "aaa" : 123.89,
     "bbb" : 789.76,
     "ccc" : "some text"
  }

And the same for JSONItem, but with JSONItem also accepting arrays with mixed contents, including arrays, such as:

Var js As JSONItem = [
    999,
    True,
    {
        "aaa" : 123.89,
        "bbb" : 789.76,
        "ccc" : "some text",
    }
  ]

And since behind the scenes it could be rendered as similarly complex Variant operations ending in similar machine code, the compile speed would be similar. But the code would be nicer to write and closer to other languages.

More testing shows it’s not specific to Pairs or the Array() function.

However, using Variants appears to be the key.

As a test, I tried creating a new class that uses only strings and doubles, like this:

class cPair
  Constructor( name as string, temp as double)
    me.name=name
    me.temp=temp

function GetStationList as cPair()
  var stations() as cPair
  stations.append new cPair("Abha", 18.0)
  stations.append new cPair("Abidjan", 26.0)
  stations.append new cPair("Abéché", 29.4)
  stations.append new cPair("Accra", 26.4)
  [ ... 400 more ... ]

I did a few variations of parameters (e.g. using unique vs. the same strings and doubles on each line) but all compiled in 2 minutes.

Compile time  kind
-----------------------------------------
120 seconds   append new cPair(x,y) // different string and double on every line
120 seconds   append new cPair("foobar", y) // same string, different double on every line
120 seconds   append new cPair("foobar", 999) // same string and double on every line
120 seconds   append new cPair() // empty constructor

However, if I change my class to use Variants:

class cPairVariant
  Constructor( name as variant, temp as variant)
    me.name=name
    me.temp=temp

Then I see the pathological behavior:

Compile time   kind
-----------------------------------------
600+ seconds   append new cPairVariant("foobar", 999) // same string and double on every line
600+ seconds   append new cPairVariant(kFoobar, k999) // same string and double on every line, using a Const in code
120 seconds    append new cPairVariant() // empty constructor

Note: 600+ means I gave up after 10 minutes

It appears to be something specific to the convert-to-variant pathway which is generating code that the LLVM compiler is having trouble with.

I’ve updated the Issue with this info.
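
One untested idea that follows from the timings above: funnel the Variant conversion through a single method with typed parameters, so that each call site passes only a String and a Double. AddStation is a hypothetical name and would need to live in a Module. Whether this actually avoids the slow Aggressive compile has not been measured here, so treat it as a sketch to try rather than a confirmed fix.

```xojo
// Hypothetical helper, defined in a Module.
// The parameters are typed, so the conversion to Variant happens
// only here, inside the Pair constructor call.
Sub AddStation(Extends stations() As Pair, name As String, temp As Double)
  stations.Add(New Pair(name, temp))
End Sub

// Usage at the call site:
Var stations() As Pair
stations.AddStation("Abha", 18.0)
stations.AddStation("Abidjan", 26.0)
```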

Yep, optimizing a container for a fixed kind of value is one thing; a box that can contain anything, managed by complex code behind it, is a rabbit hole.

Hi Mike
Here is my question: does this also apply if you have an array of a class, where the class has only two properties corresponding to the pair?

I believe it depends on what the two properties are. Variants are slow, and Pairs use Variants. If the properties are Integers, Doubles, or Strings, then I would expect it not to be slow to compile.

That being said, there may be more than one way to trigger this “Aggressive compile takes forever” bug…

It is more a behavior than a bug. A bug is something that doesn’t work because of an error in the code, and it has a fix. What’s going on here is a complex task (deep analysis of a complex context, looking for improvements without breaking things) that takes a long time but does complete. A bug would be a compile that never ends or that breaks the final code. And it probably happens at the LLVM level, where Xojo has no control over the results and times apart from the optimization levels it chooses.

I’m thinking of a class with two Doubles as properties, no Variants.
Thanks for the information.