Is there a way of disposing of worker processes and having them recreated?
We’re doing some extensive processing in workers and they work quite well. We also have a niggling memory leak that can accumulate over the processing of many thousands of items, causing the workers’ RAM use to grow over time. In a server-type situation it’s worrying.
We’re working on stomping out the leak, but in the meantime it would be great if we could dispose of them periodically and have them respawn. Nothing we’ve attempted causes the workers themselves to disappear…
Since workers are objects, when you drop the last reference to one, it will be destroyed. The trick is that you may not be able to do this in the JobCompleted event. You may need to fire a short-period timer to replace this one with the next one.
Thanks Tim, but I don’t get the impression that they act that way.
The project’s worker object seems to be an automatically-instantiated singleton (at least from the app side) and I can’t see any sign of references to the worker/console app instances that get created.
They’re all conveniently abstracted away from view.
I created a ‘threatID’ property that is populated worker-side and returned with each jobCompleted event, so I can keep an eye on whether workers are reused or new ones are created. No recreation occurs.
We also tried setting .corePercent and .maximumCoreCount to 0, waiting a bit, and resetting them to their normal values - but the original worker instances were reused…
I may be on the completely wrong track here, but maybe it’s how you are quitting?
How about using a flag that is set when the command is received, and checking it to exit the worker’s processing loop so that the worker quits “naturally”?
The framework might need that type of “natural” quit to inform the parent app it’s dying if actually calling quit was not considered in the original design.
You’re onto something there. _Quitting = true seems to have an effect, but our code is fairly complicated and it’s taking some work to get it to behave the way we’d hoped.
I’ll report back on our results, but in the meantime I thought I’d ask about an error that’s getting thrown:
When we want a worker to ‘retire’ we return “kill” from the app-side jobRequested event. The worker watches for jobData = “kill” in its JobRun event, sets your helpful _Quitting = true, and returns “killed:”+workerID. That seems to work as expected, but the Error event then fires with “kill” as its jobData and it’s tough to know why.
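For anyone following along, here is a rough sketch of the same sentinel idea in Python, using a thread and queues rather than the framework's Worker API. The names worker_loop, run_demo, and the "done:" result format are made up for illustration; only the "kill" job data, the quitting flag, and the "killed:"+workerID reply come from the description above.

```python
import queue
import threading

KILL = "kill"  # sentinel job data, mirroring the "kill" string described above


def worker_loop(worker_id, jobs, results):
    """Process jobs until the kill sentinel arrives, then fall out of the loop."""
    quitting = False  # plays the role of the _Quitting flag
    while not quitting:
        job = jobs.get()
        if job == KILL:
            quitting = True  # let the loop end on its own instead of forcing a quit
            results.put(f"killed:{worker_id}")
        else:
            results.put(f"done:{job}:{worker_id}")  # hypothetical result format


def run_demo():
    """Feed one worker two jobs and a kill request; return its replies and liveness."""
    jobs, results = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker_loop, args=(1, jobs, results))
    t.start()
    for job in ["a", "b", KILL]:
        jobs.put(job)
    t.join(timeout=5)
    replies = []
    while not results.empty():
        replies.append(results.get())
    return replies, t.is_alive()
```

The point of the flag is that the worker acknowledges the request and then exits through its normal code path, so whatever is supervising it sees an ordinary shutdown.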
The dox say “If there is an error, the job data is provided here so you can determine which job caused the error.” which isn’t terribly useful as we don’t know why the event is firing (and we’re not raising it ourselves)… any clue there?
Yup, you’ve got it there. Once we removed all the other possible errors (including exceptions thrown as a result of what we were doing), it is the most likely remaining explanation.
So, we now have it working where we can command workers to quit and then have them recreated again as needed.
We found that restarting the workers using a timer (as a whole, using .start) is essential to provide time for the quitting workers to be removed, otherwise they get reused.
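The shape of what we ended up with looks roughly like this Python sketch. It is only an analogy: it uses threads and queues, not the framework's workers, and process_with_recycling and max_jobs_per_worker are invented names. The join() call stands in for our timer delay, i.e. making sure the old worker is really gone before a replacement is started.

```python
import queue
import threading


def worker_loop(worker_id, jobs, results):
    """Tag each job with this worker's ID; return when told to quit."""
    while True:
        job = jobs.get()
        if job == "kill":
            return  # natural exit from the loop
        results.put((worker_id, job))


def process_with_recycling(items, max_jobs_per_worker):
    """Run items through one worker at a time, retiring it after a fixed job count."""
    results = queue.Queue()
    next_id = 1
    jobs = None
    thread = None
    handled = 0
    for item in items:
        if thread is None:
            # No live worker: create a fresh one with a new ID.
            jobs = queue.Queue()
            thread = threading.Thread(target=worker_loop, args=(next_id, jobs, results))
            thread.start()
            next_id += 1
            handled = 0
        jobs.put(item)
        handled += 1
        if handled >= max_jobs_per_worker:
            jobs.put("kill")
            thread.join()  # wait for the old worker to go away before replacing it
            thread = None
    if thread is not None:
        jobs.put("kill")
        thread.join()
    out = []
    while not results.empty():
        out.append(results.get())
    return out
```

Calling process_with_recycling(range(5), 2) hands the first two items to worker 1, the next two to worker 2, and the last one to worker 3, so the IDs in the results show that workers are being replaced rather than reused.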
Thanks for your help. I created an issue requesting that this be exposed and handled better.