Folks: I have a built (compiled) Windows Xojo desktop app that loads 1,000,000 observations into a MemoryBlock. That goes quickly. Then it loops through those observations multiple times, optimizing a set of parameters. This is slow. Windows Task Manager tells me that the built app is using less than 1% CPU, much less than Task Manager itself!
The crazy thing is the App runs faster in Debug mode (25% CPU) than in Build mode (1% CPU).
Any suggestions about why this is happening?
I have tried XOJO 2022R2, 2023 R1.1 and 2023 R2 - the same for all three. Default optimization.
All my #IFs add extra error checking for Debug mode. None add anything extra for Build mode.
The App is not threaded. It has many app.doevents to allow the user to intervene. No user intervention for my test runs.
It was the creeping (almost stopped) progress bar that first alerted me to the problem, as well as Debug Mode taking 2 hours running a test dataset, and Build Mode not finished after 12 hours. Then I looked at the Task Manager …
You are right. I need to construct a mini-version of my app, but hoped someone knew a reason or two why this could happen.
Luckily my clients are running the old compiled VB6 version of the App. My estimate is that the Xojo Build App should run twice as fast as compiled VB6. It did in some prototypes, but now the production Xojo code tells a different story so far.
Remove all app.doevents from the current method
Remove all calls to your progressbar in the method, plus anything else that updates the screen.
Add a form-wide, or global property of ‘percentagecomplete’
Update that property in your method to the current percentage, (where you were previously updating the progressbar, and calling doevents)
Add a timer, which repeat fires now and then… it looks at percentagecomplete, and updates the progressbar to that value when it fires.
Add a thread to the window, and in the thread’s .run event, call your method
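The steps above could be sketched roughly as follows. This is only an outline, not the original poster's code: CalcThread, ProgressTimer, ProgressBar1, RunOptimization, obsIndex and totalObs are hypothetical names, and PercentageComplete is the window property suggested above.

```xojo
// Window property (added in the IDE): PercentageComplete As Integer

// --- CalcThread.Run event (a Thread placed on the window) ---
// Call the existing long-running method; it no longer touches the UI.
RunOptimization // hypothetical name for the optimizing method

// --- Inside RunOptimization, where the progress bar and DoEvents used to be ---
// Only store the number; nothing is drawn here.
PercentageComplete = (obsIndex * 100) \ totalObs

// --- ProgressTimer.Action event (Timer with RunMode = Multiple, Period = 250) ---
// Runs on the main thread, so it is safe to update the control.
ProgressBar1.Value = PercentageComplete

// --- Wherever the run is launched, e.g. a button's event ---
ProgressTimer.RunMode = Timer.RunModes.Multiple
CalcThread.Start
```

The point of the split is that the calculation never waits on screen drawing; the Timer picks up the latest value a few times per second.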
If you use the AddUserInterfaceUpdate method of your Thread to update the UI, instead of a Timer (which runs on the main thread and, since Xojo threads are cooperative, competes with your calculation Thread for the same core), you may save slightly more processing cycles. I am not sure about this, but it’s worth a try.
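A minimal sketch of the AddUserInterfaceUpdate approach, assuming a Thread on the window and a progress bar called ProgressBar1 (both names are hypothetical, as are totalObs and the "percent" key):

```xojo
// --- Thread.Run event ---
Var totalObs As Integer = 1000000 // assumed dataset size
For i As Integer = 0 To totalObs - 1
  // ... process observation i ...
  If i Mod 10000 = 0 Then
    // Queue a UI update; it is delivered on the main thread.
    Me.AddUserInterfaceUpdate("percent" : ((i * 100) \ totalObs))
  End If
Next

// --- Thread.UserInterfaceUpdate(data() As Dictionary) event ---
// Several queued updates may arrive together in one call.
For Each update As Dictionary In data
  If update.HasKey("percent") Then
    ProgressBar1.Value = update.Value("percent")
  End If
Next
```

Queuing an update only every 10,000 observations is an arbitrary choice here; the idea is simply to keep UI traffic small compared with the work being done.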
Maybe use the Profiler menu in the IDE to see where you waste time.
Show progress updates only once a few seconds have passed.
Or build a small stopwatch class for time measurement.
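A stopwatch and a time-based progress throttle might look something like this. It is a sketch under assumptions: the class name StopWatch, its members, and the variables totalObs and PercentageComplete are made up for illustration; System.Microseconds and System.DebugLog are the only framework calls used.

```xojo
// --- Class StopWatch (created in the IDE) ---
// Property: mStart As Double

// Method Start()
mStart = System.Microseconds

// Method ElapsedSeconds() As Double
Return (System.Microseconds - mStart) / 1000000.0

// --- Usage around the slow section ---
Var sw As New StopWatch
sw.Start
Var totalObs As Integer = 1000000 // assumed dataset size
Var lastUpdate As Double = System.Microseconds
For i As Integer = 0 To totalObs - 1
  // ... process observation i ...
  // Check the clock only occasionally, and report at most every 2 seconds.
  If i Mod 1000 = 0 And System.Microseconds - lastUpdate >= 2000000 Then
    PercentageComplete = (i * 100) \ totalObs
    lastUpdate = System.Microseconds
  End If
Next
System.DebugLog("Loop took " + sw.ElapsedSeconds.ToString + " s")
```

Timing individual sections this way works the same in Debug and built apps, which may help narrow down where the two diverge.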
I’m curious, did you maybe change the Optimization Level in the Shared Build settings to Moderate? I ask because that mode sacrifices speed for built app size and would only affect the built app.
Default optimization. Ran the profiler in Debug mode. Everything was as expected. Oops! Did not notice “Using the profiler with built apps”. Yes, that will be the next step after this report.
Yes, the plan is to thread the app, refactor etc., but first I need to verify that the converted VB6 code produces the same results with Xojo as it did with VB6 for the target dataset.
BTW, converting the VB6 code to Xojo was surprisingly straightforward. Global replaces did almost everything.
My most recent test, with a smaller dataset and bypassing many methods is:
Debug Mode: 32:35:16 ( = 32 minutes)
Build Mode: 33:55:10
Commented out the app.doevents
Did not activate the GUI (ran as a background app)
No progress bar.
Again, smaller dataset and bypassing many methods.
Merely launch and run to completion. This should definitely be faster because there is no user input or screen drawing.
Build Mode: 35:48:24
Longer time! What is going on??
Is it debugging in 64bit mode but building a 32bit exe?
Without sight of the code/profiling, it’s all guesswork at this end…
Off topic:
Since you’re using a memoryblock.
If you access parts of the MemoryBlock by calculating an offset (e.g. offset = x * 1000 + y),
then one good speedup is to precalculate whenever possible, such as when using nested loops.
Hoisting the multiplication out of the inner loop cuts it from 1,000,000 executions to 1,000, for example:
for x = 0 to 999
  x1000 = x * 1000 // computed once per outer iteration
  for y = 0 to 999
    offset = x1000 + y
    // ... use offset here ...
  next
next
If you are getting hold of basic types like an integer, then it can be faster to start with a POINTER to the memoryblock.
Instead of myblock.UINT32(2000) you use
dim myblockptr as ptr
myblockptr = myblock
myblockptr.uint32(2000)
It looks the same, but memoryblock.uint32() is a method and wastes time pushing stuff onto the stack.
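Putting the Ptr and the precalculated offset together, a loop over the block could look roughly like this. The 1000 x 1000 layout of UInt32 values and all variable names are assumptions for illustration, not the original poster's data format. Offsets are in bytes, so each UInt32 advances by 4.

```xojo
// Hypothetical block: 1000 rows x 1000 columns of UInt32 (4 bytes each)
Var myblock As New MemoryBlock(1000 * 1000 * 4)
Var myblockptr As Ptr = myblock // a MemoryBlock converts to a Ptr

Var total As UInt64
For x As Integer = 0 To 999
  Var rowBase As Integer = x * 1000 * 4 // byte offset of row x, computed once
  For y As Integer = 0 To 999
    // Direct memory read through the Ptr
    total = total + myblockptr.UInt32(rowBase + y * 4)
  Next
Next
```

One caveat: a Ptr does no bounds checking, so an offset past the end of the block reads or corrupts unrelated memory instead of raising an exception. It is worth validating the offsets with the MemoryBlock accessors first.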
Ok, so we’re talking about 2 minutes over the course of 1 million readings, or roughly 8300 samples per second. A difference like that could simply be caused by whatever else your computer was doing at the time. For instance, did your computer run a backup? Or maybe you put your app in the background? Or did an internet search while you waited?
Also, how much of the time was spent reading from disk?
Ultimately you may want to try out Workers as well for processing this much data to take better advantage of the multiple cores of your machine.
Sorry that’s 1 million samples over 32 minutes. 520 samples per second. I still think this deviation has to do with something other than the actual processing time.
If it weren’t you saying so, I wouldn’t believe this; I would be sceptical of a customer reporting something similar to me.
Everything sounds contrary. It’s like suggesting you press down on the gas pedal and hearing that this makes the car run slower.
Without the (or representative example) code, we can’t tell if it is code or environment.
Do you have any other machines to test it on?
Can you publish a compiled app somewhere that we could get a timing from?
Yes, I understand your scepticism. This situation is certainly not right. It looked initially as though Xojo would have no problem with this project, as smaller test datasets run fine. But production-size datasets are failing.
I will try to isolate the problem, but it may be something nasty like stack or heap overflow.