Feedback crashed and maxed my CPU with lots of cefsubproc.exe

My CPU was potentially maxed out for a few hours when Feedback crashed while I was away for lunch.

I came back to a hot room with my PC fans running at max speed.

Can this happen in our apps too?

Not impressed.

https://www.dropbox.com/s/ykfetipfyayhc8y/FeedbackCrash.mp4?dl=0

That's the HTMLViewer on Windows using WebKit.
It would almost seem that the cefsubproc.exe processes aren't spun up in a way that makes the children die when the parent dies.
I would suppose this could happen in your own apps, since I don't believe there is anything special about Feedback's use of the HTMLViewer in this regard.

You take a few hours for lunch? ;p
Sorry, not funny, I know.

Sunday lunch, nice and relaxing :stuck_out_tongue:

until you return & find your machine on fire :slight_smile:

Indeed, and I’ve only just had this new CPU on warranty because the last one fried!

heh … reminds me of a Sun E10K a former client bought early on, where the CPUs would overheat to the point that they blew out of the CPU boards.
Some faulty microcode that got triggered frequently by software we’d bought from another vendor would cause it.
Sun sent engineers to investigate and found that.

But holes in CPU daughter boards aren't fun when the machine you bought because it would “always be up” shuts down without warning.

CPU explosions do that. Go figure

Ouch, physical damage from code is never fun!

I produced a reproduction project and video; it seems it can happen with our apps too.

<https://xojo.com/issue/55867>

The sub-process is orphaned. Maybe its code should be changed so that it receives the PID from the caller and checks from time to time (say, every 10 seconds) whether the parent process still exists; if it doesn’t, the sub-process just ends itself. This way any orphaned process will die within 10 seconds of losing its parent.

How’d you recover from these? Could you use Task Manager to kill them all?
And if it can happen to Feedback, it can probably happen to any app we produce.
That seems pretty serious.

Exiting gracefully.

Adjust the code of cefsubproc.exe to receive the main app's PID by some means; the simplest one is a command-line parameter, like “cefsubproc.exe parentpid=726”. In the main Xojo app, get its own PID and pass it to the new cefsubproc.exe (this is not userspace code, it's a Xojo internal framework task). Inside the cefsubproc code, periodically check whether PID 726 is still active (any real PID, this is just an example). Once cefsubproc doesn’t find it, it should close whatever it’s doing and exit, because it has detected that this instance of cefsubproc.exe was orphaned.
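
To make the idea above concrete, here is a rough sketch in Python rather than in the actual cefsubproc code, which none of us can see. The `parentpid=` argument comes from the post above; the function names `parse_parent_pid`, `pid_alive`, and `watch_parent` are made up for illustration, and the Windows branch assumes the child is allowed to open the parent process.

```python
import os
import sys
import time


def parse_parent_pid(argv):
    """Return the PID from a 'parentpid=<n>' argument, or None if absent."""
    for arg in argv[1:]:
        key, sep, value = arg.partition("=")
        if sep and key.lower() == "parentpid" and value.isdecimal():
            return int(value)
    return None


def pid_alive(pid):
    """Best-effort check that a process with this PID is still running."""
    if sys.platform == "win32":
        import ctypes

        SYNCHRONIZE = 0x00100000
        WAIT_TIMEOUT = 0x00000102
        ERROR_ACCESS_DENIED = 5
        k32 = ctypes.WinDLL("kernel32", use_last_error=True)
        k32.OpenProcess.restype = ctypes.c_void_p
        k32.OpenProcess.argtypes = [ctypes.c_uint32, ctypes.c_int, ctypes.c_uint32]
        k32.WaitForSingleObject.restype = ctypes.c_uint32
        k32.WaitForSingleObject.argtypes = [ctypes.c_void_p, ctypes.c_uint32]
        k32.CloseHandle.argtypes = [ctypes.c_void_p]
        handle = k32.OpenProcess(SYNCHRONIZE, False, pid)
        if not handle:
            # Access denied means the process exists but we may not open it.
            return ctypes.get_last_error() == ERROR_ACCESS_DENIED
        try:
            # A running process is not signaled, so a zero-timeout wait times out.
            return k32.WaitForSingleObject(handle, 0) == WAIT_TIMEOUT
        finally:
            k32.CloseHandle(handle)
    try:
        os.kill(pid, 0)  # POSIX only: signal 0 checks existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def watch_parent(parent_pid, interval=10.0, alive=pid_alive, sleep=time.sleep):
    """Block until the parent is gone; the caller then cleans up and exits."""
    while alive(parent_pid):
        sleep(interval)
```

In a real helper process this loop would presumably run on a background thread, with the process exiting as soon as `watch_parent` returns. One caveat: PIDs can be reused, so polling by number can be fooled if another process picks up the old parent's PID. Opening a handle to the parent once at startup and waiting on that handle would avoid that on Windows.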

I hate to resurrect an old thread, but this is a verified bug, marked as reproducible, and it has shown up in my shipping Windows applications built with the current release of Xojo. Any chance we can bump up the priority on getting this fixed?

Feedback Case #55867: <https://xojo.com/issue/55867>

I’ve made this my top case in Feedback, but it’s a serious enough problem that I’m asking anyone else who wants it fixed to also add it to their top cases.

Interestingly, we had a report of this from a customer just yesterday so I’ve signed onto that case and added a screenshot.

There’s a colon missing from the URL that you posted. Here’s the Feedback link: <https://xojo.com/issue/55867>

Are Norman and Rick related?

I was in two minds about releasing this, as fixes seem to come slower if there’s a workaround. This isn’t strictly a workaround, more of a mitigation, but as it's been a few months now and there doesn’t seem to be any movement on this being resolved, I threw this together.

Just call cefCleaner.Clean() in your app.open and it will hunt for and terminate any cefsubproc.exe processes that were created by your app.

I’ve tested this on the following, but you might also want to test it yourself and implement some error handling.

W7 32
W7 64
W8.1 32
W8.1 64
W10

If you can work it into your app so that it runs at the end of your crash detection, that would be even better, because if it's only in app.open the processes will remain running until you run your app again. If you have it only in app.open, at least you won’t end up with a silly number of runaway processes if the user keeps restarting your app after crashes.

https://www.dropbox.com/s/pdbit6x5ug1iifx/cefCleaner.xojo_binary_project?dl=0

You can also call it whenever you want and it will terminate and clean up any HTMLViewer WebKit processes (best done when the window is closed and not instantiated).
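
For anyone who wants to roll their own, the selection step of a cleaner like this can be sketched as below. This is not the code inside cefCleaner; it is a guess at the general idea in Python. The `Proc` record and the `find_orphans` and `kill` names are hypothetical, and how you obtain the process snapshot (PID, parent PID, image name) is left open.

```python
import subprocess
from collections import namedtuple

# One row of a process snapshot: PID, parent PID, and image name.
Proc = namedtuple("Proc", "pid ppid name")


def find_orphans(snapshot, image_name="cefsubproc.exe"):
    """Return PIDs of image_name processes whose parent is not in the snapshot."""
    live_pids = {p.pid for p in snapshot}
    return [
        p.pid
        for p in snapshot
        if p.name.lower() == image_name.lower() and p.ppid not in live_pids
    ]


def kill(pid):
    """Force-terminate one process with the Windows taskkill tool."""
    subprocess.run(["taskkill", "/F", "/PID", str(pid)], check=False)
```

Given a snapshot, `for pid in find_orphans(snapshot): kill(pid)` would do the cleanup. Note that this treats every parentless cefsubproc.exe as fair game, so a real version would also need some way to tell your app's helpers from another app's, and it can be fooled if the dead parent's PID has been reused.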

Hope it helps.

Thank you for this workaround, Julian! I really wish it was not necessary though - this is a pretty large issue, and it really should get fixed asap.

Thanks for catching and fixing that, Tom - no idea how I messed up the original URL.

I quickly hunted for an API on Windows that would tie a child process's existence to its parent's
And utterly failed to find one using CreateProcess
It seems that unless you use a job, CreateProcess (which I assume CEF is using to spawn the child EXEs) cannot set up a child process that will die if the parent dies
So if the process that starts things crashes, everything is orphaned - exactly what we see happening

A job, however, can terminate all processes that are part of it, from what I can find

As far as I know, a job is like a “group” in which you organize the processes. If the parent dies, the children still become orphans, but they are now tied to this “group”. If I recall correctly, it’s a tool that the participants can use to know about each other. So, if you associate a main process and a child process in a “job”, and the child process checks the list of those processes and finds itself alone, it can end itself, as it is orphaned. On the other hand, if you group 10 processes in a job and detect something going wrong and out of control, you can kill “the job”, so all processes end at once. @William Yu could investigate and would know better.
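
One detail from the Win32 documentation that seems relevant here: a job can be given the `JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE` limit, in which case Windows terminates every process in the job when the last handle to the job is closed, and that includes the case where the process holding the handle crashes. Below is a minimal sketch in Python with ctypes. It is not what CEF or the Xojo framework does (nobody in this thread knows that), and the helper names `create_kill_on_close_job` and `assign_pid_to_job` are made up.

```python
import ctypes
import sys

JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE = 0x00002000
JobObjectExtendedLimitInformation = 9
PROCESS_TERMINATE = 0x0001
PROCESS_SET_QUOTA = 0x0100


class JOBOBJECT_BASIC_LIMIT_INFORMATION(ctypes.Structure):
    _fields_ = [
        ("PerProcessUserTimeLimit", ctypes.c_int64),
        ("PerJobUserTimeLimit", ctypes.c_int64),
        ("LimitFlags", ctypes.c_uint32),
        ("MinimumWorkingSetSize", ctypes.c_size_t),
        ("MaximumWorkingSetSize", ctypes.c_size_t),
        ("ActiveProcessLimit", ctypes.c_uint32),
        ("Affinity", ctypes.c_size_t),
        ("PriorityClass", ctypes.c_uint32),
        ("SchedulingClass", ctypes.c_uint32),
    ]


class IO_COUNTERS(ctypes.Structure):
    _fields_ = [(name, ctypes.c_uint64) for name in (
        "ReadOperationCount", "WriteOperationCount", "OtherOperationCount",
        "ReadTransferCount", "WriteTransferCount", "OtherTransferCount")]


class JOBOBJECT_EXTENDED_LIMIT_INFORMATION(ctypes.Structure):
    _fields_ = [
        ("BasicLimitInformation", JOBOBJECT_BASIC_LIMIT_INFORMATION),
        ("IoInfo", IO_COUNTERS),
        ("ProcessMemoryLimit", ctypes.c_size_t),
        ("JobMemoryLimit", ctypes.c_size_t),
        ("PeakProcessMemoryUsed", ctypes.c_size_t),
        ("PeakJobMemoryUsed", ctypes.c_size_t),
    ]


def create_kill_on_close_job():
    """Create a job whose processes die when the last job handle closes."""
    if sys.platform != "win32":
        raise OSError("Job objects are a Windows-only feature")
    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.CreateJobObjectW.restype = ctypes.c_void_p
    k32.CreateJobObjectW.argtypes = [ctypes.c_void_p, ctypes.c_wchar_p]
    k32.SetInformationJobObject.argtypes = [
        ctypes.c_void_p, ctypes.c_int,
        ctypes.POINTER(JOBOBJECT_EXTENDED_LIMIT_INFORMATION), ctypes.c_uint32]
    job = k32.CreateJobObjectW(None, None)  # anonymous job, default security
    if not job:
        raise ctypes.WinError(ctypes.get_last_error())
    info = JOBOBJECT_EXTENDED_LIMIT_INFORMATION()
    info.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE
    if not k32.SetInformationJobObject(
            job, JobObjectExtendedLimitInformation,
            ctypes.byref(info), ctypes.sizeof(info)):
        raise ctypes.WinError(ctypes.get_last_error())
    return job  # keep this handle open for the lifetime of the parent


def assign_pid_to_job(job, pid):
    """Put an already-running child process (by PID) into the job."""
    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.OpenProcess.restype = ctypes.c_void_p
    k32.OpenProcess.argtypes = [ctypes.c_uint32, ctypes.c_int, ctypes.c_uint32]
    k32.AssignProcessToJobObject.argtypes = [ctypes.c_void_p, ctypes.c_void_p]
    k32.CloseHandle.argtypes = [ctypes.c_void_p]
    proc = k32.OpenProcess(PROCESS_SET_QUOTA | PROCESS_TERMINATE, False, pid)
    if not proc:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        if not k32.AssignProcessToJobObject(job, proc):
            raise ctypes.WinError(ctypes.get_last_error())
    finally:
        k32.CloseHandle(proc)
```

The parent would call `create_kill_on_close_job()` once, then `assign_pid_to_job(job, child_pid)` for each helper it starts, and simply never close the job handle. There is a small window between starting a child and assigning it to the job, and on older Windows versions a process that is already in another job may refuse the assignment, so treat this as a starting point and test it on the versions you ship to.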