Memoryblocks are slower starting with 2024r3

Aaron_Hunt · November 3, 2024, 11:01pm

Try it for yourself and see the results. If your apps are doing anything substantial with memoryblocks, you’ll notice significantly slower performance starting with 2024r3.

dim s, e, t as double
s = System.Microseconds
dim z() as memoryblock
dim j, k as integer
for j = 0 to 1000000
  dim m as new MemoryBlock(8)
  for k = 0 to 7
    m.byte(k) = j mod pow(2,k)
  next
  z.append m
next
e = System.Microseconds
t = e - s
dim r as string = format( t, "0.0000" )
dim clip as new Clipboard
clip.SetText r
MsgBox r
quit
// Mac Studio M1 results running in the IDE:
//2024r2 1311051,0000
//2024r3 1868024,4583

Note: speed of 2024r3.1 is the same as r3

Ian_Kennedy · November 4, 2024, 12:03am

Likely to do with memory management protections to allow Preemptive threads to work safely. There is currently a bug report for faster semaphores / critical sections, which hopefully will appear in an upcoming version.

https://tracker.xojo.com/xojoinc/xojo/-/issues/77649

Christian_Schmitz · November 4, 2024, 6:34am

Please use a Ptr to work on the MemoryBlock.

If you use the MemoryBlock methods, every byte read or write is a function call with bounds checking.
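
A minimal sketch of that approach, assuming an 8-byte block like the one in the benchmark above (the variable names are only illustrative):

```xojo
Dim m As New MemoryBlock(8)
Dim p As Ptr = m   // a MemoryBlock converts implicitly to a Ptr

For k As Integer = 0 To 7
  p.Byte(k) = k    // raw write at byte offset k, with no bounds check
Next

Dim firstByte As Integer = p.Byte(0)   // raw read at offset 0
```

Because access through the Ptr is not bounds checked, keeping every offset inside m.Size is up to the caller.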

Jeff_Tullin · November 4, 2024, 8:02am

I got a substantial speedup from switching to using a ptr for memoryblock manipulation some years back.

As a tip, I always ‘get’ a new ptr just before using the memoryblock. (I think I recall issues when getting a ptr once and keeping it for the life of the app; maybe the block gets moved about.)

Aaron_Hunt · November 4, 2024, 8:54am

Thanks, but unless I’ve misunderstood something, concerning the difference in speed between r3 and r2, that doesn’t matter. Using pragmas to speed it up, r3 is still slower. Add these to the code above and see for yourself:

#pragma DisableBackgroundTasks
#pragma StackOverflowChecking false
#pragma DisableBoundsChecking
#Pragma NilObjectChecking false

Jeff_Tullin · November 4, 2024, 9:35am

Ptr is not a pragma.
It is possible that the method-based access is slower.

But is access through a pointer slower?

x = myblock.byte(100)
//versus
var memptr as ptr = myblock
x = memptr.byte(100)

kevin_g · November 4, 2024, 9:37am

I suggest you log a bug with your example.

It might be related to https://tracker.xojo.com/xojoinc/xojo/-/issues/77649 or it could be a new issue that needs investigating.

Aaron_Hunt · November 4, 2024, 9:43am

Thanks, I’ve already reported it to Xojo in an existing bug report.

Aaron_Hunt · November 4, 2024, 9:57am

Using Ptr is slower in all cases (so the advice to use it appears to be fundamentally wrong). Here are the numbers on my machine:

//2024r2 Memoryblock: 1311051,0000 … Ptr: 1334671,2083
//2024r3 Memoryblock: 1868628,1250 … Ptr: 1951363,6250

Not surprising, since using Ptr adds a step (to get the pointer!)

kevin_g · November 4, 2024, 10:01am

Are you assigning the ptr variable outside of the k loop?

Aaron_Hunt · November 4, 2024, 10:01am

Of course!
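
The Ptr variant of the loop was not posted in the thread. As an assumption, it presumably looks something like this sketch, with the pointer taken once per block, outside the k loop:

```xojo
Dim z() As MemoryBlock
For j As Integer = 0 To 1000000
  Dim m As New MemoryBlock(8)
  Dim p As Ptr = m                // taken once per block, outside the k loop
  For k As Integer = 0 To 7
    p.Byte(k) = j Mod Pow(2, k)   // same write as the original, via the pointer
  Next
  z.Append m
Next
```

With only eight writes per block, the cost of creating each MemoryBlock may well dominate the timing, so a small difference between the two access styles would not be surprising in this particular test.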

Aaron_Hunt · November 4, 2024, 10:12am

You can remove the code doing anything with the memoryblocks. Just create them. r3 is slower.

dim s, e, t as double
s = System.Microseconds
dim z() as memoryblock
dim j as integer
for j = 0 to 1000000
  dim m as new MemoryBlock(8)
  z.append m
next
e = System.Microseconds
t = e - s
dim r as string = format( t, "0.0000" )
dim clip as new Clipboard
clip.SetText r
MsgBox r
quit

// 2024r3 356887,2500
// 2024r2 234973,2917

Aaron_Hunt · November 4, 2024, 10:15am

Forget about making an array out of the memoryblocks. r3 is slower.

dim s, e, t as double
s = System.Microseconds
for j as integer = 0 to 1000000
  dim m as new MemoryBlock(8)
next
e = System.Microseconds
t = e - s
dim r as string = format( t, "0.0000" )
dim clip as new Clipboard
clip.SetText r
MsgBox r
quit

// 2024r2 = 212367,6667
// 2024r3 = 300177,5833

Aaron_Hunt · November 4, 2024, 10:26am

Here’s the real kicker … remove the memoryblock! r3 is slower!

dim s, e, t as double
s = System.Microseconds
for j as integer = 0 to 1000000
  'dim m as new MemoryBlock(8)
next
e = System.Microseconds
t = e - s
dim r as string = format( t, "0.0000" )
dim clip as new Clipboard
clip.SetText r
MsgBox r
quit

// 2024r2 = 57570,9583
// 2024r3 = 84383,8333

So this is starting to make sense. No matter what the project is doing, it’s going to be slower in r3. When I run any project in 2024r2, it’s noticeably faster than it is in r3. Nobody else has noticed this? I find that hard to believe.
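
To rule out run-to-run noise, the empty-loop timing could be repeated and the best run kept. This is only a sketch; the run count of 5 is an arbitrary assumption:

```xojo
Dim best As Double = -1
For run As Integer = 1 To 5
  Dim s As Double = System.Microseconds
  For j As Integer = 0 To 1000000
    // intentionally empty: measures loop overhead only
  Next
  Dim t As Double = System.Microseconds - s
  If best < 0 Or t < best Then best = t   // keep the fastest run
Next
MsgBox Format(best, "0.0000")
```

Comparing the best of several runs in each release would show whether the gap is consistent or only appears on a single measurement.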

Jeff_Tullin · November 4, 2024, 10:40am

Ok.
So r3 is slower in general.
That’s worthy of note, given that we got a pretty good speed hike a couple of years back.

Kem_Tekinay · November 4, 2024, 1:27pm

I hadn’t measured it, but was told there would be a performance trade-off for implementing preemptive threads. I thought that was announced, but maybe not.

But given recent developments, I’m hopeful that this will be addressed in the next release. I’m not sure if that means we’ll be back to pre-r3 performance, but it sounds like we’d be closer.

Jeff_Tullin · November 4, 2024, 1:55pm

As someone who won’t ever be using preemptive threads, I’d like to think there is a switch to allow ‘old way without’ versus ‘new way’.
I would be a bit miffed to find everyone’s ‘standard’ performance nobbled to serve a few folk who need the threads.

Tim_Parnell · November 4, 2024, 1:56pm

That’s exactly what happened.

TimStreater · November 4, 2024, 2:07pm

My app uses threads a lot, but since most of the time they are waiting on I/O, I don’t imagine that switching to pre-emptive threads will make a scrap of difference. I might try it though, just for a larf.

Aaron_Hunt · November 4, 2024, 2:19pm

Ironically, I’m one of those people who has been waiting for preemptive threads, but I haven’t been able to use them, or more exactly: when I use a preemptive thread, the performance is worse, not better, and when I don’t use a preemptive thread, but I’m using the tool which offers them, the performance is also worse than before.

I agree completely that the whole boat shouldn’t slow down just because these things are available but not used. But I also understand there are design issues which may mean that’s simply not possible.

The p-threads are a new feature. Some of us have waited patiently for around 20 years to have them. We can wait a little longer for them to work as we want them to. Or maybe they just won’t. I do hope they won’t be responsible for making the tool generally worse. It seems that is what has happened so far, save those who report huge speed increases for certain tasks done using the p-threads.