multithreading with multiple core machines

Perry_Paolantonio · October 9, 2019, 12:50pm

In reading about threading in Xojo I see that it doesn’t assign threads to different cores. Is there any way to do that? My use case is an application that needs to go through a folder that could contain tens of thousands of files, and generate a checksum of each file. When the application encounters a large file, such as a 500 GB QuickTime movie, the MD5 hash is very slow and bogs down the app until it’s done. When it hits a folder of TIFF files, though, it zips right through them. I’d like to spawn an arbitrary number of threads, based on the core count of the machine at hand (it would be a user preference), then have those do a checksum and report back to the main thread. It seems I can’t do this with threading, so how would I?

What are the strategies for taking advantage of multi-core machines with Xojo, if any?

Thanks!

Louis_D · October 9, 2019, 1:06pm

Helper apps are the usual multi-core strategy. Helper apps that you spawn will generally run on cores other than the one the main application is using. There is an example under console/multiprocessing.

Paul_Lefebvre · October 9, 2019, 1:11pm

You can start separate console apps on multiple cores and have each of them communicate results back to a main app. More information:

https://documentation.xojo.com/topics/threading/creating_helper_apps.html
Examples/Console/Multiprocessing/WordCounter
Examples/Console/Multiprocessing/WordCounterGUI
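
For readers who want the helper-app idea in language-neutral terms, here is a minimal sketch. It is written in Python rather than Xojo, and it is not the WordCounter example: the helper source, `start_helper`, and `collect` are hypothetical names made up for illustration. The main program starts a separate process that hashes one file and prints the digest, then reads that output back. Because each helper is its own process, the operating system can schedule it on any core.

```python
import subprocess
import sys

# Hypothetical helper program: hashes the file named in argv[1]
# and prints the hex digest on stdout.
HELPER_SOURCE = """
import hashlib, sys
h = hashlib.md5()
with open(sys.argv[1], "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)
print(h.hexdigest())
"""

def start_helper(path):
    """Launch one helper process; this call returns immediately."""
    return subprocess.Popen(
        [sys.executable, "-c", HELPER_SOURCE, path],
        stdout=subprocess.PIPE,
        text=True,
    )

def collect(proc):
    """Wait for a helper to finish and return the digest it printed."""
    out, _ = proc.communicate()
    return out.strip()
```

In a real Xojo project the helper would be a compiled console app and the main app would talk to it through a Shell or a socket, but the division of labor is the same.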

Perry_Paolantonio · October 9, 2019, 1:25pm

Ahh, ok. Well that certainly seems easier than dealing with threads in some ways…

Thanks

Perry_Paolantonio · October 9, 2019, 4:58pm

How would I go about creating (and then referencing) an arbitrary number of shells?

Each machine this application runs on has highly variable hardware. My little iMac is a measly 6-core i7, but we have some Linux boxes with dual 14-core Xeons, and some Windows machines with 12- and 14-core CPUs as well.

I’m having a hard time wrapping my head around how I’d set up a user-definable number of shells and how I’d reference them. I would need to run a loop that looks for a shell that’s finished, gets its result, and sends it another task. Hardcoding, say, 8 shells is no big deal, but I may not need 8 in some cases, or I may want more.

Norman_Palardy · October 9, 2019, 5:10pm

You could ask the CPU for the number of cores it has and create that number of shells (or maybe a couple fewer).

Perry_Paolantonio · October 9, 2019, 5:19pm

That’s not really the question, though. (And it’s not what I want to do here anyway, because I don’t necessarily want to max out the capability of the machine - there are times when we might be doing this in the background, so we don’t want to bog it down.)

The question is - how do I programmatically set up X number of shells, and then once set up, how do I reference each one?

What I’m looking to do is set up a “pool” of shells that I can call as needed until the overall job is done.

Perry_Paolantonio · October 9, 2019, 5:23pm

This isn’t what I’m asking.

Christian_Schmitz · October 9, 2019, 5:25pm

MBS Xojo Plugins come with a lot of multithreaded methods, which can help you to use multiple cores.

see blog post:
Multithreaded plugin functions can increase speed of Xojo application

Markus_Winter · October 9, 2019, 5:26pm

If you don’t know how many cores you CAN use, then how do you determine how many shells you should set up?

But suit yourself. Answer deleted as off the mark.

Kyryl_Pekarov · October 9, 2019, 5:27pm

Use an array of Shells

Christian_Schmitz · October 9, 2019, 5:33pm

You could use SystemInformationMBS ProcessorCount function to check the core count.

Norman_Palardy · October 9, 2019, 5:37pm

[quote=457166:@Perry Paolantonio]That’s not really the question though. (but it’s not what I want to do here because I don’t necessarily want to max out the capability of the machine - there are times when we might be doing this in the background so we don’t want to bog it down).

The question is - how do I programmatically set up X number of shells, and then once set up, how do I reference each one?

What I’m looking to do is set up a “pool” of shells that I can call as needed until the overall job is done.[/quote]
Shells? Make an array of them and add as many as you want.
They are instances of a class (the Shell class, in fact), so you create as many as you want in a loop with “New Shell”.
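
The array-of-shells idea, together with the "find a finished one, take its result, hand it the next task" loop asked about earlier, can be sketched roughly as follows. This is Python, not Xojo, and `run_pool` and `make_command` are hypothetical names. Each slot in the list plays the role of one Shell instance. The sketch assumes every helper prints only a short result (such as a digest), so its output fits in the pipe buffer while the loop polls.

```python
import subprocess
import time

def run_pool(tasks, pool_size, make_command):
    """Run one helper process per task, never more than pool_size at once.

    make_command(task) must return the argument list for that task.
    Returns a dict mapping each task to the text its helper printed.
    """
    pending = list(tasks)
    slots = [None] * pool_size        # each slot is (task, process) or None
    results = {}
    while pending or any(slots):
        for i, slot in enumerate(slots):
            if slot is not None:
                task, proc = slot
                if proc.poll() is None:
                    continue          # this helper is still busy
                # Helper finished: collect its output and free the slot.
                results[task] = proc.stdout.read().strip()
                proc.stdout.close()
                slots[i] = None
            if pending:
                # Slot is free: give it the next task.
                task = pending.pop(0)
                proc = subprocess.Popen(
                    make_command(task), stdout=subprocess.PIPE, text=True
                )
                slots[i] = (task, proc)
        time.sleep(0.01)              # avoid spinning at 100% CPU
    return results
```

Because `pool_size` is just a number passed in, it can come straight from a user preference. In Xojo the polling loop would more naturally be replaced by the Shell's completion event, but the bookkeeping is the same.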

Andrew_Lambert · October 9, 2019, 5:39pm

Are you using the MD5Digest class? If not then try it; it’s better suited for processing large files than the MD5 method.
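
The reason a digest object suits large files is that it can be fed the file one chunk at a time, so memory use stays small no matter how big the file is. A minimal sketch of that pattern, using Python's `hashlib` as a stand-in for Xojo's MD5Digest (the function name and the 1 MB chunk size are arbitrary choices for illustration):

```python
import hashlib

def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file incrementally; memory use stays near chunk_size."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break                 # end of file
            digest.update(chunk)      # feed the next piece to the digest
    return digest.hexdigest()
```

The result is identical to hashing the whole file in one call; only the memory footprint differs.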

Perry_Paolantonio · October 9, 2019, 5:45pm

It’s a user preference in the application. It will vary depending on the machine. Some days you can afford to leave a machine running flat out, some days you can’t. We’ve found (using a command line tool that does what we’re building this custom app for) that this is very much dependent upon the I/O load on our SAN and on the speed of the available cores. It’s a balancing act, but letting it go full tilt by using too many cores will definitely slow the overall process down - either by bogging down the machine or overloading the disk I/O capability of the system.

Perry_Paolantonio · October 9, 2019, 5:45pm

Thanks - that never occurred to me. I’ll look into this now.

Perry_Paolantonio · October 9, 2019, 5:48pm

I am. While MD5 is the most common request from our clients, the specification we’re complying with also supports SHA-1, SHA-256 and SHA-512. I haven’t tried the other algorithms yet to test for speed. Is there an equivalent to MD5Digest for those?

Christian_Schmitz · October 9, 2019, 6:30pm

Please check our classes in MBS Xojo Encryption Plugin:

They all have multithreaded HashFile methods, which give time to other Xojo threads while hashing on a different CPU core. So when you hash 8 files at the same time, you can get 8 cores busy!

Markus_Winter · October 9, 2019, 6:46pm

Again: how do you know how many cores there are? It might be a user preference how many cores you WANT to use, but there needs to be a maximum that you can’t and should not exceed. Or can a user set it to, for example, 100 on a 6-core machine? Because that is possible, just not recommended.

Maybe the users know what hardware they are running on and can set it accordingly - but they should not HAVE TO know - the system should tell them.

And in the future other users might not know - it is bad practice to depend on user knowledge.

But that’s just my personal opinion.

Personally I always leave at least one core free on my machine for general tasks.
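
One way to reconcile the two positions (a user preference, but with a ceiling the system works out for itself) is to clamp the requested number against the detected core count. A small sketch in Python; `worker_count` and the `reserve` parameter are hypothetical names, and note that `os.cpu_count()` reports logical cores, which may be more than the physical core count:

```python
import os

def worker_count(requested, cores=None, reserve=1):
    """Clamp a user-requested worker count to what the machine can offer.

    reserve is the number of cores left free for general tasks.
    """
    if cores is None:
        cores = os.cpu_count() or 1   # cpu_count() can return None
    ceiling = max(1, cores - reserve)
    return max(1, min(requested, ceiling))

print(worker_count(100, cores=6))  # → 5
print(worker_count(3, cores=6))    # → 3
```

The preference screen can then show the ceiling as the maximum allowed value, so users never have to know the hardware themselves.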

Markus_Winter · October 9, 2019, 6:47pm

And yes, you should seriously look into the MBS plugins.