Message:
/home/william/libcxx-3.4/src/mutex.cpp:84: std::__1::recursive_mutex::~recursive_mutex(): Assertion `e == 0' failed.
This is Ubuntu 22.04. What kinds of things should I be hunting for to find the cause?
I guess some kind of lock/unlock imbalance.
Probably an unlock of a lock that was never acquired happened (at the framework level?).
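For anyone wondering what that assertion is actually checking: as far as I can tell from the libc++ source, `~recursive_mutex()` calls `pthread_mutex_destroy` and asserts that the return code `e` is 0. The sketch below is only an illustration of that mechanism, not framework code. `DestroyResult` and `destroy_recursive_mutex_demo` are made-up names. It assumes glibc, which returns `EBUSY` when a mutex is destroyed while still locked; POSIX itself leaves that case undefined, so other platforms may behave differently.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Hypothetical helper type: return codes from the two destroy attempts.
struct DestroyResult {
    int while_locked;  // pthread_mutex_destroy while the lock is still held
    int after_unlock;  // pthread_mutex_destroy once lock/unlock are balanced
};

// Hypothetical demo function; assumes glibc behaviour (see the note above).
DestroyResult destroy_recursive_mutex_demo() {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);

    pthread_mutex_t m;
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);  // an "Enter" with no matching "Leave" yet

    DestroyResult r;
    // On glibc this reports EBUSY and leaves the mutex intact.
    // A destructor that asserts on a zero return code would abort here.
    r.while_locked = pthread_mutex_destroy(&m);

    pthread_mutex_unlock(&m);  // balance the lock
    r.after_unlock = pthread_mutex_destroy(&m);
    return r;
}
```

So an object whose lock is still held when it is torn down would be enough to trip `e == 0`, which is the kind of imbalance to hunt for.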
2025r1 seems to be “resource leaking”; if you are using it, maybe that is related?
It’s only happening on Linux. I’m thinking this is a Framework bug, but hell if I’m able to reproduce it in a simple project. Here’s some output from Valgrind. The trouble is that the log message on the first line of this excerpt comes from the very last line of the ProcessNextFax method. So I would think this has to be related to object destruction. But I’m not using any destructor methods, and even if I were, they’d show up in the stack. So it seems whatever is causing this is in the code generated at the end of my method.
1741806485.085000000 ImageProcessor acc591bc-12c9-4bf9-8870-03870ffe217a Processing of fax 1e1a3aa3-c00b-4ed5-8d21-a157f06ada61 is finished.
4sfworker: /home/william/libcxx-3.4/src/mutex.cpp:84: std::__1::recursive_mutex::~recursive_mutex(): Assertion `e == 0' failed.
==363332==
==363332== Process terminating with default action of signal 6 (SIGABRT)
==363332== at 0x737D9FC: __pthread_kill_implementation (pthread_kill.c:44)
==363332== by 0x737D9FC: __pthread_kill_internal (pthread_kill.c:78)
==363332== by 0x737D9FC: pthread_kill@@GLIBC_2.34 (pthread_kill.c:89)
==363332== by 0x7329475: raise (raise.c:26)
==363332== by 0x730F7F2: abort (abort.c:79)
==363332== by 0x730F71A: __assert_fail_base.cold (assert.c:94)
==363332== by 0x7320E95: __assert_fail (assert.c:103)
==363332== by 0x7908398: std::__1::recursive_mutex::~recursive_mutex() (in /home/ubuntu/worker/4sfworker/4sfworker Libs/libc++.so.1)
==363332== by 0x4B79853: ??? (in /home/ubuntu/worker/4sfworker/4sfworker Libs/XojoConsoleFramework64.so)
==363332== by 0x4B7869B: RuntimeUnlockObject (in /home/ubuntu/worker/4sfworker/4sfworker Libs/XojoConsoleFramework64.so)
==363332== by 0xC898EA: ImageProcessor.ProcessNextFax%%o<ImageProcessor>o<CloudDatabase>s (in /home/ubuntu/worker/4sfworker/4sfworker)
==363332== by 0xC08B72: ImageProcessor.Event_Run%%o<ImageProcessor>o<Thread>o<CloudDatabase>b (in /home/ubuntu/worker/4sfworker/4sfworker)
==363332== by 0x9DAECB: SimpleFaxWorker.mThread_Run%%o<SimpleFaxWorker>o<Thread> (in /home/ubuntu/worker/4sfworker/4sfworker)
==363332== by 0x4B82974: ??? (in /home/ubuntu/worker/4sfworker/4sfworker Libs/XojoConsoleFramework64.so)
@William_Yu Since your name is in the failed assertion, any light you can shed on this?
But the framework is unwinding things.
Make a simple test sample; William will probably pinpoint the offending part of the framework fairly easily with a very basic sample.
It really changes the meaning of what I wrote if you leave off important words.
I skipped the “if” when reading, sorry.
I have more questions than answers, but at first glance, this could be a race condition.
We’ll need more details to troubleshoot:
Yes. This does not happen with cooperative threads.
No.
No. Since it happens on Linux and not Mac, I have to use a near-production server environment to reproduce it.
Probably not the answer you’re looking for, but tracking down race conditions can be tricky, so I’d try protecting as much as possible with critical sections or semaphores until the crashes stop, and work from there to isolate the problem.
Yeah, I know. I’ve spent days trying to track this down, adding more and more critical sections, even in places that I know don’t need them. It’s been driving me nuts.
But as you wrote the reply, after commenting out most of the method and meticulously turning it back on, I narrowed it down to a single line. And sure enough, I screwed up my lock release there. Entered the critical section and returned before leaving. So @Rick_Araujo nailed it.
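For reference, here is the shape of that mistake and one way to make it hard to repeat, sketched in C++ rather than Xojo since that is where the assertion lives. Everything here is hypothetical: `CriticalSection` is a stand-in class with a depth counter added purely for the demo, and `ScopedEnter`, `ProcessLeaky` and `ProcessGuarded` are made-up names, not the real method. The guard relies on C++ scope exit; in Xojo the same discipline means making sure every Enter has a matching Leave on every exit path.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a critical section. Depth() exists only so the
// demo can observe whether Enter/Leave calls are balanced.
class CriticalSection {
public:
    void Enter() { m_.lock(); ++depth_; }
    void Leave() { --depth_; m_.unlock(); }
    int Depth() const { return depth_; }
private:
    std::recursive_mutex m_;
    int depth_ = 0;
};

// Scope guard: enters on construction, leaves on every exit from the scope.
class ScopedEnter {
public:
    explicit ScopedEnter(CriticalSection& cs) : cs_(cs) { cs_.Enter(); }
    ~ScopedEnter() { cs_.Leave(); }
    ScopedEnter(const ScopedEnter&) = delete;
    ScopedEnter& operator=(const ScopedEnter&) = delete;
private:
    CriticalSection& cs_;
};

// The bug: the early return skips Leave(), so the lock stays held.
bool ProcessLeaky(CriticalSection& cs, bool nothingToDo) {
    cs.Enter();
    if (nothingToDo) return false;  // returns while still inside the section
    cs.Leave();
    return true;
}

// Same logic with the guard: both return paths release the lock.
bool ProcessGuarded(CriticalSection& cs, bool nothingToDo) {
    ScopedEnter guard(cs);
    if (nothingToDo) return false;
    return true;
}
```

Calling `ProcessLeaky(cs, true)` leaves `cs.Depth()` at 1, while `ProcessGuarded` leaves it at 0 on both paths.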
What I find interesting is this isn’t happening on my Mac.
Maybe the framework code for Macs does not have such an assert and “forgives the imbalance”; or it fires an exception that you catch and discard; or simply different timing, different results…