Faster with more threads?

Brian_O_Brien · October 26, 2017, 9:35pm

I have a process that loops through half a million lines of text and builds up a database out of the contents.

dim mylog(-1) as string
mylog = getalllines
for i as integer = 0 to mylog.ubound
  processline(mylog, i)
next i

Given that this seems to be one thread doing the work, would things go faster if I set up 4 threads to do this and gave each thread a list of its own to process?

DaveS · October 26, 2017, 9:44pm

My understanding (and someone correct me if I’m wrong) is that Xojo apps use only a SINGLE core of the CPU regardless…
If you have ONE (i.e. “main”) thread, it can use up to 100% of that single core’s power.
If you have multiple threads, only one of them (including “main”) is active at any one time, so there is no “parallel” processing going on. You are simply dividing that one core, and Xojo switches back and forth between all the threads in turn.

So if you had 1000 tasks, the time to do them would be roughly the same, with or without threads…

Daniel_Wilson · October 26, 2017, 9:45pm

I think it would be slower… You can run the work in helper apps instead.

Xojo uses a single core per application instance. Multi-threading a single app will probably slow it down due to time slicing.

I attended a webinar or watched a video on this topic. It suggested spinning up multiple helper apps and sharing the load between them. This is the only way to get Xojo to use multiple cores to achieve a single task.

DaveS · October 26, 2017, 9:49pm

To MAYBE use multiple cores…
All you do is move the load balancing from Xojo to the OS, at which point every other app in the system (including operating system tasks) enters the equation, and the OS may decide NOT to split the helpers across multiple cores. Of course, the more cores available, the higher the likelihood that it will.

KevinW · October 26, 2017, 9:50pm

Have you tried to profile the code to see which things are using time?

KevinW · October 26, 2017, 9:53pm

I did not post that twice. It just showed up in my browser like I did though. Weird.

anon20074439 · October 26, 2017, 10:08pm

As has been said already, you lot are nice and fast

https://documentation.xojo.com/index.php/Thread

The more context switches you have away from your loop the longer it will take to finish.

Ideally, you would check the number of cores the CPU has (n), split your data set into n parts (or n-1 to make sure the UI and OS have some free CPU cycles) and spawn that many worker console apps to work on the data.
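The split-by-core-count idea above can be sketched roughly as follows. This is Python rather than Xojo, purely as a language-neutral illustration; `split_into_chunks`, `process_chunk`, `worker_count` and `run_parallel` are hypothetical names, and `process_chunk` is only a stand-in for the real per-line work. Python's `multiprocessing.Pool` plays the role of the worker console apps here, since each pool worker is a separate OS process.

```python
import os
from multiprocessing import Pool


def split_into_chunks(items, n):
    """Split items into n contiguous chunks whose sizes differ by at most one."""
    n = max(1, n)
    size, extra = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(lines):
    """Hypothetical stand-in for the real work: counts the non-blank lines."""
    return sum(1 for line in lines if line.strip())


def worker_count():
    """One fewer worker than cores, so the UI and OS keep some CPU time."""
    return max(1, (os.cpu_count() or 1) - 1)


def run_parallel(lines):
    """Process each chunk in its own OS process and combine the results."""
    n = worker_count()
    with Pool(processes=n) as pool:
        return sum(pool.map(process_chunk, split_into_chunks(lines, n)))
```

Calling `run_parallel(all_lines)` from under an `if __name__ == "__main__":` guard would process the whole log. Whether this beats a single loop depends on how much work each line needs compared with the cost of handing the data to the workers.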

I don’t know if it’s a design limitation of the underlying framework or if Xojo just doesn’t want the headache of in-app multi-threading problems and questions, but it would be really nice if it were built in.

Brian_O_Brien · October 26, 2017, 10:12pm

I’m thinking that if I create a few async shells… maybe that would be a better ‘thread’.

Daniel_Wilson · October 26, 2017, 10:53pm

I did attempt async shells long ago to solve this very issue… It was pretty hard, probably because of my specific use case. In the end I used helper apps. Let us know if it works for you!

Norman_P · October 26, 2017, 11:06pm

Helpers have a few nice characteristics:

each one is a separate process, so each gets its own memory space and can be scheduled by the OS to run on whatever cores are available
since it’s a discrete application you can debug it stand-alone (until you try to debug preemptive thread bugs you have no idea how much this is worth its weight in gold)
since it IS a separate process you don’t have to worry about all the weird and wonderful race conditions and other fun that preemptive threads bring to the table
this is something you can actually do, since you don’t need the framework to be thread safe etc. (which it’s not)

There are some downsides that mostly have to do with managing them, communicating with them and passing data back and forth, since they can’t reach into the main app and read its properties etc. (which IS a good thing)

Sam_Rowlands · October 27, 2017, 1:38am

My question would be, how about the database? Can it allow multiple apps on the same machine to connect at once and be writing all that data?

Tim_Jones · October 27, 2017, 1:55am

That depends on the database. Postgres and Maria/MySQL are very capable of handling thousands of concurrent connections.

Beatrix_Willius · October 27, 2017, 4:34am

Speed improvements are more of an art. Do you know how to use Instruments? It has a profiler that is way better than the Xojo one. In the past I have been able to speed up code with very simple changes. The profiler can’t always help and you know your code best.

Helper apps are a pain. My app now has two and I need to work on the third, which will be for a situation a bit like yours.

While the app is writing data for n to the database, the helper app will fetch data for n+1. In my first tests this was slower than doing everything synchronously. I was using XML to get the data from the helper app and Xojo was taking too long to pick the XML apart. Then I stumbled on a cool function called shared memory, where you put data into a variable in one app and the second app can access the same data. This is a lot of work and you need to evaluate what will help you in your situation.
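The "write n while fetching n+1" overlap described above might look roughly like this. It is a minimal Python sketch, not the poster's implementation: it uses a background thread where the post uses a helper app, and `pipeline`, `fetch` and `write` are hypothetical names. It assumes the fetch step spends its time waiting on I/O. CPU-bound fetching would need a separate process to gain anything.

```python
from concurrent.futures import ThreadPoolExecutor


def pipeline(keys, fetch, write):
    """Write the data for key n while the data for key n+1 is being fetched."""
    if not keys:
        return
    with ThreadPoolExecutor(max_workers=1) as executor:
        pending = executor.submit(fetch, keys[0])
        for next_key in keys[1:]:
            data = pending.result()                     # wait for item n
            pending = executor.submit(fetch, next_key)  # start fetching n+1
            write(data)                                 # write n meanwhile
        write(pending.result())                         # write the last item
```

Results are still written in their original order, so the database sees the same sequence as in the synchronous version. Only the waiting overlaps.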

AFAIK in one of the next Xojo conferences Thomas Tempelmann is going to present something on helper apps.

kevin_g · October 27, 2017, 10:22am

Helper apps work great if you want to horizontally and vertically scale the processing capabilities of your app. However, there are situations where this approach is completely over the top. For example, we sometimes run into situations where the file systems our app interacts with are very slow and a single file deletion or folder creation can take more than 10 seconds (sometimes 30+ seconds). Performing this command blocks the entire app, which prevents any other type of processing from occurring at the same time. Using a helper app or even a shell to solve this adds complexity, makes cross-platform development harder and just seems wrong (especially when our code is already running in a thread).
(FYI, quite a lot of this problem is caused by FolderItems themselves being inefficient, and we got a significant performance increase by using the OS APIs directly.)

What frustrates me is that we constantly hear that writing / debugging multi-threaded code is difficult and should be avoided. If we had a thread safe framework this would be our choice to make and we would have the ability to fall back to co-operative threading when we wanted to keep things simple. I know the difficulty with publishing future plans but it would be nice to have some type of statement from Geoff saying yes it will be thread safe one day or sorry, no - never. If we are going to be stuck on a single core forever then I would at least like to know that so that I can choose another tool if a project’s future concurrency requirements will never match Xojo.

As helper apps appear to be the proposed solution today, Xojo should be putting in some effort to make this approach much easier to work with. From my own experience here are things that I would like to see improved:

IDE:
In our situation, 90% of the code for our GUI app and helper console app is the same. Xojo should have the ability to host different targets (Console / GUI) in a single project like you can with other development systems (<https://xojo.com/issue/38467>).
As we don’t have that ability, it means having multiple projects and working with external items. Unfortunately, working with external items seems to have been neglected a bit, as it is not possible to save in Text format (<https://xojo.com/issue/48105>) and there are bugs when working with encrypted classes (<https://xojo.com/issue/47726> <https://xojo.com/issue/48105>).

Runtime:
Xojo should provide at minimum some good boilerplate classes and examples that show how managing processes and sharing data across them can be implemented.

Maurizio_Rossi · October 27, 2017, 12:56pm

What follows is my personal point of view and is not expressed to start some battle or “global war”.

Concurrent (i.e. non-cooperative) multithreading can be done in a Xojo app with DLLs written using other tools.
Of course this has many limits and constraints, such as portability and the need to use frameworks other than Xojo.

What I personally find very “old way” is that we have multicore 64-bit processors, a 64-bit framework and soon a 64-bit IDE, but we must create processes like it was 30 years ago on UNIX.
Multicore is everywhere (desktop, phones…) and is also very cheap, as with the Raspberry Pi.

The use of critical sections and semaphores is something that Xojo users can’t avoid when referring to something shared and concurrently modified.
Critical sections and semaphores ARE parts of the Xojo framework.
The missing part is the gain that can be obtained by a concurrent scheduling system.

Sooner or later Xojo will need to address the complications imposed by cooperative scheduling, which outweigh the problems that this approach is trying to avoid.

Brian_O_Brien · October 27, 2017, 4:19pm

I’m thinking that the SQL server is going to be a very busy camper handling all those requests…