Worker operating on large matrices

I would like to add a worker to my program that carries out molecular orbital calculations. The worker would be used to calculate matrix elements that populate a very large matrix that is used to calculate properties. The problem I see is that the worker will need access to a matrix of eigenvectors that is typically on the order of 200 x 200 or larger. How can I provide the worker access to this matrix?
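
(For scale, a rough back-of-the-envelope calculation, assuming 8-byte Doubles; the snippet is illustrative only:)

    // Approximate size of a 200 x 200 eigenvector matrix
    Var n As Integer = 200
    Var bytes As Integer = n * n * 8           // 8 bytes per Double
    System.DebugLog(Str(bytes) + " bytes")     // 320,000 bytes, roughly 320 KB per copy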

That is the biggest potential issue with helpers, after the overhead to start them up… the speed of data transfer.

Unfortunately the most efficient way is not built in and is platform-specific, but I have been told it can be done with declares or a plugin…

Shared memory would both be very fast and require less memory and CPU …

If you could overlay a memory block over a segment of shared memory (using flags/semaphores to control access) the numbers would not even have to be converted to strings and back!
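
For illustration, here is a minimal sketch of that idea using a plain MemoryBlock (the flag byte stands in for whatever flag/semaphore scheme you use, and `eigenvectors` is an assumed 2D Double array; with real shared memory, `mem` would be the mapped block rather than a local allocation):

    // 8-byte header (flag) followed by a 200 x 200 block of Doubles
    Var mem As New MemoryBlock(8 + 200 * 200 * 8)

    // Writer side: store the matrix directly as Doubles, then raise the "ready" flag
    For row As Integer = 0 To 199
      For col As Integer = 0 To 199
        mem.DoubleValue(8 + (row * 200 + col) * 8) = eigenvectors(row, col)
      Next
    Next
    mem.UInt8Value(0) = 1 // ready

    // Reader side: check the flag, then read values in place, no string conversion
    If mem.UInt8Value(0) = 1 Then
      Var v As Double = mem.DoubleValue(8 + (10 * 200 + 3) * 8) // element (10, 3)
    End If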

I was starting to work on that last summer for a different reason, but had to drop it for another project.

-karen

Would it be possible, then, to split the problem into four parts, have a separate helper for each part, and transfer all the necessary data at the start? The overhead would be bad, but I would only have to pay it four times.

As Karen states, shared memory is THE optimal path: one block of memory that is accessed by the helpers and the controller.

Ideally you should be splitting the problem up into many, many parts, so that you can saturate the CPU and solve the problem quicker. An 8-core i9 CPU with Hyper-Threading can run 16 threads at once; a 64-core AMD chip with SMT can run 128 (which is why AMD Hackintoshes are the fastest Macs on the planet).

Without shared memory and depending on the volume of the data, it’s possible to pass the entire chunk to the helper on launch (I forget what the limit is) and then for the helper to pass back the processed data on completion. It is just a console app after all.
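
For reference, a minimal sketch of that pass-in/pass-back pattern using the Worker events (the math in JobRun is just a placeholder, and `chunk` is an assumed MemoryBlock property in the main app):

    // Main app, MyWorker.JobRequested() As String:
    // hand the worker its data as one string (here, the raw bytes of a MemoryBlock)
    Return chunk.StringValue(0, chunk.Size)

    // Console worker, MyWorker.JobRun(data As String) As String:
    Var mb As MemoryBlock = data
    For i As Integer = 0 To (mb.Size \ 8) - 1
      mb.DoubleValue(i * 8) = mb.DoubleValue(i * 8) * 2.0 // placeholder calculation
    Next
    Return mb.StringValue(0, mb.Size)

    // Main app, MyWorker.JobCompleted(result As String):
    Var processed As MemoryBlock = result
    // ... merge the processed chunk back into the full matrix ...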

The other alternative is to create a separate data file for each worker; this of course eats into the time a worker can save, because disk IO now slows it down.

You can create a RAM disk in code and write out each worker's data to it, which will improve the time for the worker to read and write its data. However, RAM disks are slow to create, so you waste time setting one up and still waste time reading and writing the data.

I have provided Xojo with the source code for App Sandbox-safe shared memory on macOS, in the hope that they’ll implement shared memory sooner, as they only need to figure it out for Windows and Linux.

IPCSocket may also be quite good.
Or just transfer JSON with worker methods.
The JSON may point to database entries in a shared database, or file paths to data files, or the name of a shared memory object.
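
For example, a job string might look like this (field names invented for illustration), so the heavy data never travels over the IPC channel:

    // Main app, e.g. in JobRequested: describe where the data lives instead of sending it
    Var job As New Dictionary
    job.Value("shm") = "EigenvectorBlock"   // name of a shared memory object...
    job.Value("file") = ""                  // ...or a file path, or a database key
    job.Value("offset") = 100
    job.Value("count") = 40000
    Return GenerateJSON(job)

    // Console worker, in JobRun(data As String): read the descriptor and fetch the data itself
    Var desc As Dictionary = ParseJSON(data)
    Var shmName As String = desc.Value("shm")
    Var offset As Integer = desc.Value("offset")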

e.g. see SetSharedMemoryValue and GetSharedMemoryValue in FileMappingMBS class.
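
A minimal sketch of that approach, assuming those shared methods take and return Strings as (name, value) and (name); check the MBS documentation for the exact signatures (`matrixAsString` is whatever serialized form you choose):

    // Main app: publish a value under a well-known name
    FileMappingMBS.SetSharedMemoryValue("MyApp.Eigenvectors", matrixAsString)

    // Console worker: read it back by the same name
    Var matrixData As String = FileMappingMBS.GetSharedMemoryValue("MyApp.Eigenvectors")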

PS: See WindowsPipeMBS class for IPCSocket on Windows.

Fyi, Worker uses IPCSocket for its communication with the main app.

ohhh…

Is there a mechanism I could use to read properties of the main app from inside the worker? I know I cannot change them, but can I access them by value?

No. The Workers launch as separate (console) apps so they know nothing of the inner workings of your main app.

Check your running processes (on the Mac, that would be through Activity Monitor) while Workers are running and you’ll see what I mean.

So why does the Worker use an IPCSocket to communicate with the main app? As I am beginning to understand, only strings can be used to send information back and forth.

Right, through the Worker events. Behind the scenes it’s using IPCSocket, but that’s today. Tomorrow it might be different because that’s an implementation detail subject to change that really doesn’t affect us one way or another.

So now I’m starting to understand how Worker works: there is no way for a Worker to access, say, a large array of data in the main app. They really are two separate processes with isolated memory spaces. Sending large data arrays is very inefficient, so I use shared memory via Christian’s plugin. I created a class that handles all this in the pre-Worker days and it works quite well. I just tried it with the new Worker class and it also runs well: approximately a 6x speed increase doing some complex math on 200M Doubles on an i7 with 4 physical cores.

I pass the name of the shared memory object and a byte offset to each worker instance so it can process a subrange of a large array of numbers. Robert, to let the worker read properties from the main app (or vice versa), what I do is reserve, say, a 100-byte header in the shared memory object; then main and all the console workers can read and write this header and communicate at will.
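
A sketch of what such a header convention might look like (offsets and field meanings are invented for illustration; `mem` stands for the MemoryBlock/view your shared-memory wrapper exposes, and `workerIndex` is the worker's own slot number):

    // Bytes 0-99: header shared by main and all workers; payload Doubles start at byte 100
    Const kHeaderSize = 100
    Const kStatusOffset = 0   // Int32 written by main: 0 = idle, 1 = data ready, 2 = abort
    Const kDoneBase = 4       // one Int32 "done" flag per worker at offsets 4, 8, 12, ...

    // Main app:
    mem.Int32Value(kStatusOffset) = 1                   // tell the workers the data is ready

    // Worker number workerIndex:
    If mem.Int32Value(kStatusOffset) = 1 Then
      // ... process my subrange, which starts at kHeaderSize + my byte offset ...
      mem.Int32Value(kDoneBase + workerIndex * 4) = 1   // report completion back to main
    End If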

I am happy to share my test app with my classes that wrap the shared memory mechanism if you want. You do need MBS though.

P.

Peter, thanks so much. I would really benefit from seeing your app. I have MBS so no problem there.

I’d be interested, too.

I am also interested. Could you perhaps post it somewhere?

Hi folks, here you go:

Tried it with 4 billion Doubles on a 32 GB iMac and got a 6x speed boost with all cores ablaze, and curiously, memory pressure did not rise much (and that’s with a lot of other stuff open at the same time). Impressive memory handling, though kernel_task was very busy, probably swapping/compressing/decompressing…

A few things to keep in mind and get used to:

  • if your app crashes, the shared memory object will persist, and if you don’t store its name, the only way to get rid of it (and the memory it consumes) is to reboot your machine. It gets confusing if you rerun your app and want to start over: creating a new object may fail if you use the same name, because it already exists but only the OS knows about it.

  • you can try this by quitting the test app (uncheck the box at the bottom, else the object is automatically destroyed), relaunching, and just opening the shared memory object: it and all its data will still be there

  • so in my real app, I keep a list of these objects (their names) in a file in case of a crash, then clean them up if necessary

  • just follow the code in the buttons and you’ll get the idea of how this works: there are two classes, one to create a shared memory object and the other to open “views” into it (2 GB max per view) at various byte offsets. This is how I divvy up the task among workers (see the sketch after this list).

  • forgive some baggage code that is left over from extracting these classes from my main app, which is still written in RS

  • some anomalies that maybe someone can help me with:

    • when my workers finish, their Error event is always called with an empty string; I’m not sure why or how to get rid of this
    • I noticed that constants defined in the main app class are visible to the worker during debug, but not in the built app. I can sort of see why (during debug everything runs inside the one app, whereas in a built app the main app and the console workers are separate processes), but I’d hoped this would be handled at compile time and that app.kMyConstant would be available everywhere. The biggest problem is that you get no warning of this between your debug and build sessions; app.kMyConstant simply appears as an empty string in the console workers (I think). So if you want a constant available everywhere (like my MBS serial number, which needs to be registered in both the main app and the worker), you have to place it in a Worker constant, not in the main app.
    • I’m still not clear on how to create a set of workers at app launch and have them ready when needed during the life of my app, to avoid the startup overhead every time. Presently, in my RS app, I just launch them in app.open and they sit idle, waiting for commands sent via IPC sockets.
    • as an aside, there was a discussion about App Nap: I had a maddening issue where large batch jobs would stall overnight because no one was moving the mouse and posting to Facebook, so the OS figured nothing important could be happening, even though main and the workers were using 800% CPU. MBS has a call to prevent sleep, which will likely solve the issue of napping workers and main. I can look it up if anyone is interested.
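
As a rough sketch of the divvying-up mentioned in the list above (names and the JSON fields are illustrative; the real thing is in the classes in the test app):

    // Split totalCount Doubles among workerCount workers as contiguous subranges
    Const kHeaderBytes = 100                        // reserved header at the start of the object
    Var totalCount As Int64 = 200000000             // e.g. 200M Doubles
    Var workerCount As Integer = 4
    Var perWorker As Int64 = totalCount \ workerCount
    Var jobs() As String

    For i As Integer = 0 To workerCount - 1
      Var startIndex As Int64 = i * perWorker
      Var count As Int64 = perWorker
      If i = workerCount - 1 Then count = totalCount - startIndex // last worker takes the remainder
      Var byteOffset As Int64 = kHeaderBytes + startIndex * 8     // 8 bytes per Double
      // each job only describes where to look; the worker opens its own view at that
      // offset, keeping each view under the 2 GB limit
      Var job As New Dictionary
      job.Value("shm") = "MySharedBlock"
      job.Value("offset") = byteOffset
      job.Value("count") = count
      jobs.Add(GenerateJSON(job))                   // queue the strings for JobRequested
    Next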

I’m sure people will improve on my implementation, and I’m looking forward to learning how to make this better from y’all, but this should be a decent starting point.

P.

Many thanks. Very informative and useful. Dropbox indicates the file was deleted, so no luck with that.

Curious, sorry about that; it is there. Let’s try again:

I just tried the updated link and can confirm that it downloads for me.

This one worked. Thanks!!