Optimize filling a Dictionary

I’m converting some PureBasic projects to Xojo. In these projects I use dictionaries a lot. I’m experiencing some serious performance issues here.

I’ve made a test script to demonstrate this.

PureBasic:

EnableExplicit

NewMap test.s(1000000)

Define x.i
For x = 1 To 1000000
  test("Key " + Str(x)) = "Some value " + Str(x)
Next x

Xojo:

Dim test As New Dictionary
  
For x As Integer = 1 To 1000000
  test.Value("Key " + x.ToText) = "Some value " + x.ToText
Next x

In PureBasic the total time is 1 second. In Xojo it’s more than 20 minutes (I stopped it at that point; I didn’t wait any longer). In the PureBasic example you can see that I define a minimum slot count (I can add more entries without having to re-dimension it). That improves the speed a lot. Without it, the run took 3 minutes.

Can I do the same optimization “trick” in Xojo?

NB: The code examples posted above are just examples to demonstrate the speed difference. They’re not exact snippets from the actual project.

Both .ToText’s are slowing things down.
What does this do?

For x = 1 To 1000000
  test.Value(x) = x
Next x

[quote=256526:@Marco Hof]Both .ToText’s are slowing things down.
What does this do?

For x = 1 To 1000000
  test.Value(x) = x
Next x[/quote]

That’s much faster indeed (< 2 seconds), but in my projects I have to concatenate strings. That’s why I made the example this way.

I found it. Replacing:

x.ToText

with

Str(x)

takes 3 seconds.

Still: why is ToText so slow compared to Str()? Sometimes I might have no choice but to use ToText.

Not at my desk right now, but I would start by calling ToText only once instead of twice.

Like:

strx = x.ToText
test.Value("Key " + strx) = "Some value " + strx

Maybe even replacing

For x As Integer = 1 To 1000000

with

[code]Dim x As Integer

For x = 1 To 1000000[/code]

may also speed it up a bit?

[quote=256530:@Sascha S]Not at my desk right now, but I would start by calling ToText only once instead of twice.

Like:

strx = x.ToText
test.Value("Key " + strx) = "Some value " + strx[/quote]

I know, but this is an example. In real projects I don’t do the same conversion twice.

But how can we tell you what may go wrong, when you present code that you would not write that way? :wink:

Yes, Str() is faster than .ToText.
However, the concatenating will slow it down as well.

But it’s not the dictionary itself being slow.

Because I’m demonstrating that the same expression is used in real-life projects.

So then it will be eg. “Str(CustomerID)” instead of “Str(x)”.

Well, in case
"Key " +
and
"Some value " +
are always the same, leave them out.
That way, you can also put CustomerID in as integer without the rest.

[quote=256540:@Marco Hof]Well, in case
"Key " +
and
"Some value " +
are always the same, leave them out.
That way, you can also put CustomerID in as integer without the rest.[/quote]

It’s an example …

In real projects that static text may be different depending on some decisions. Like record 1 = “Text 1” and record 2-5 = “Text 2”.

The example was meant to demonstrate an issue caused by, as we now know, string manipulations. In that context it doesn’t matter how the static text was chosen.

Johan, I can’t reproduce your results. Your original code takes about 5 seconds here.

Are you sure you’re using this:

[code] Dim test As New Dictionary

For x As Integer = 1 To 1000000
test.Value("Key " + x.ToText) = "Some value " + x.ToText
Next x[/code]

I’m testing in 2015R4.1, 32-bit, on Windows 10 in debug mode (in the IDE).

I copied and pasted your code. I’m on a Mac, but otherwise, same.

If I manually modify the BinCount, I can cut that down to a little over 4 seconds. If I use the new framework Dictionary, it goes to around 7 seconds. But still nothing like what you’re seeing.
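For anyone who wants to try the BinCount tweak, here is a minimal sketch, assuming the classic framework Dictionary. The value 1000003 is only an illustrative guess, not a tuned or recommended number, and the effect may differ per platform and Xojo version. The sketch also uses Str() with a single conversion per iteration, as suggested earlier in the thread.

[code]// Sketch only: classic framework Dictionary with a manually set BinCount.
Dim test As New Dictionary
test.BinCount = 1000003 // hypothetical bin count, chosen only for illustration

For x As Integer = 1 To 1000000
  Dim s As String = Str(x) // convert once, reuse for both key and value
  test.Value("Key " + s) = "Some value " + s
Next x[/code]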

Maybe using “Using Xojo.Core” would help?

Well, I tried it again, but in my environment (the same one on which the PureBasic example was tested) I still get the same long time.

To be sure you and I are testing with the same example, I’ve made a test project:

Test project download

You didn’t mention in your example code above that you were using the new framework Dictionary as shown in your test project, but my results are the same: 7 seconds.

The classic Dictionary is faster.

BTW, Microseconds or Ticks will give you a finer estimate of elapsed time than using a Date object.
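A minimal sketch of such a measurement, assuming the classic framework Dictionary and a loop like the one in the first post (here with Str()). The variable names startTime and elapsed are made up for illustration.

[code]// Sketch only: time the fill loop with Microseconds instead of a Date object.
Dim startTime As Double = Microseconds // microseconds since system start

Dim test As New Dictionary
For x As Integer = 1 To 1000000
  test.Value("Key " + Str(x)) = "Some value " + Str(x)
Next x

// Convert the difference from microseconds to seconds
Dim elapsed As Double = (Microseconds - startTime) / 1000000
System.DebugLog("Elapsed: " + Str(elapsed) + " s")[/code]

Ticks works the same way but counts in 1/60ths of a second, so divide the difference by 60 instead.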