Linux Desktop App, Crashing, Freezing, and Weird Behavior

I have a desktop application that runs very well on my own hardware (macOS, Raspberry Pi, Ubuntu VM). However, due to some constraints, I have had to use a customer-provided embedded HMI display panel running an older Linux kernel.

It has been quite an uphill battle getting it all to work, but I feel like it’s almost there, just a few issues left to fix. The 2 major issues that are kinda showstoppers are a random freezing issue and a crashing issue.

The random freezing issue seemed to be happening at least once a day until I modified the logic for receiving CAN bus frames. Previously it started a new thread each time a message was received; now it leaves the thread paused and just resumes it when the next frame is received. The freezing also might have had something to do with a UDPSocket or TCPSocket failing to connect, but I wasn’t scientific enough with my testing and just disabled everything I didn’t absolutely need. I have not had the issue happen in about 2 days, so I can’t say that it’s fixed, but maybe it’s better?
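
As a rough illustration of that change (a Python sketch rather than the actual Xojo code; `FrameWorker`, `submit`, `stop`, and the handler callback are made-up names), the pattern is one long-lived worker that blocks until a frame is handed to it, instead of a new thread per frame:

```python
import queue
import threading


class FrameWorker:
    """One long-lived thread that stays blocked until a frame arrives."""

    def __init__(self, handler):
        self._handler = handler        # called once per frame, on the worker thread
        self._frames = queue.Queue()   # thread-safe hand-off from the receive callback
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, frame):
        """Called by the receive callback; wakes the waiting worker."""
        self._frames.put(frame)

    def _run(self):
        while True:
            frame = self._frames.get()  # blocks here ("paused") until a frame is queued
            if frame is None:           # sentinel value: shut the worker down
                break
            self._handler(frame)

    def stop(self):
        """Queue the sentinel and wait for the worker to finish."""
        self._frames.put(None)
        self._thread.join()
```

Frames are processed in arrival order on a single thread, so the cost of creating and tearing down a thread per frame goes away. Whether that was the actual cause of the freezes is not established.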

When it froze, there was an empty message box on the screen that could not be closed. The system was in standby, so it was not really doing anything and there was no user interaction. All runtime exceptions are logged (there were none). Note: The dialog is transparent and the redacted standby logo in the middle of the screen is visible through it.

The other issue is that one of my windows causes a crash when I close it (either via the X button or self.Close from a Button.Action event). I have many windows that close just fine, but this one causes a hard crash (see stack trace below). There is nothing special about this window, just an empty ListBox, 3 buttons, and a Timer that is off. It crashes every time without fail.

The kernel is older, with realtime patches. I had to manually install the libunwind library as Xojo could not find it.

I have stress tested sending 2 billion frames using an Ubuntu VM on my Mac (M1) with the realtime kernel patch and had no issues. I also tried all combinations of closing windows and could not reproduce the issue on any of my own hardware. I have had various versions of this application running well for 5+ years on RPi4 and macOS (Intel and M1/M2).

Any ideas of where to go next would be most appreciated.

Versions

$ uname -a
Linux HMI-d6eb 4.14.78-rt47 #1 SMP PREEMPT RT Wed Sep 27 11:49:42 UTC 2023 aarch64 GNU/Linux

$ xdpyinfo -version
1.3.2

$ gtk-launch --version
GTK V3.24.8

Backtrace

#0  0x0000ffffbc280a08 in raise () from /lib/libc.so.6
#1  0x0000ffffbb380a34 in g_log_structured_standard () from /usr/lib/libglib-2.0.so.0
#2  0x0000ffffbba7b918 in ?? () from /usr/lib/libgdk-3.so.0
#3  0x0000ffffbba85d7c in ?? () from /usr/lib/libgdk-3.so.0
#4  0x0000ffffbb21dc6c in _XError () from /usr/lib/libX11.so.6
#5  0x0000ffffbb21ab54 in ?? () from /usr/lib/libX11.so.6
#6  0x0000ffffbb21bd8c in _XReply () from /usr/lib/libX11.so.6
#7  0x0000ffffbb2002b0 in XGetGeometry () from /usr/lib/libX11.so.6
#8  0x0000ffff83e688ec in get_active_window_y_coord() () from /usr/lib/gtk-3.0/3.0.0/immodules/libgtk-im.so
#9  0x0000ffff83e68a78 in ?? () from /usr/lib/gtk-3.0/3.0.0/immodules/libgtk-im.so
#10 0x0000ffffbbcb1da4 in ?? () from /usr/lib/libgtk-3.so.0
#11 0x0000ffffbb484450 in g_cclosure_marshal_VOID__OBJECTv () from /usr/lib/libgobject-2.0.so.0
#12 0x0000ffffbb47efa4 in ?? () from /usr/lib/libgobject-2.0.so.0
#13 0x0000ffffbb480ce8 in ?? () from /usr/lib/libgobject-2.0.so.0
#14 0x0000ffffbb4a05c4 in g_signal_emit_valist () from /usr/lib/libgobject-2.0.so.0
#15 0x0000ffffbb4a0ad0 in g_signal_emit () from /usr/lib/libgobject-2.0.so.0
#16 0x0000ffffbbe9a460 in ?? () from /usr/lib/libgtk-3.so.0
#17 0x0000ffffbbce5508 in ?? () from /usr/lib/libgtk-3.so.0
#18 0x0000ffffbbe9a49c in ?? () from /usr/lib/libgtk-3.so.0
#19 0x0000ffffbbd35ec8 in ?? () from /usr/lib/libgtk-3.so.0
#20 0x0000ffffbbe9a49c in ?? () from /usr/lib/libgtk-3.so.0
#21 0x0000ffffbbc1c354 in ?? () from /usr/lib/libgtk-3.so.0
#22 0x0000ffffbbe9a49c in ?? () from /usr/lib/libgtk-3.so.0
#23 0x0000ffffbbe9e154 in ?? () from /usr/lib/libgtk-3.so.0
#24 0x0000ffffbbeae8a0 in gtk_widget_unparent () from /usr/lib/libgtk-3.so.0
#25 0x0000ffffbbc17e18 in ?? () from /usr/lib/libgtk-3.so.0
#26 0x0000ffffbb484450 in g_cclosure_marshal_VOID__OBJECTv () from /usr/lib/libgobject-2.0.so.0
#27 0x0000ffffbb47efa4 in ?? () from /usr/lib/libgobject-2.0.so.0
#28 0x0000ffffbb480ce8 in ?? () from /usr/lib/libgobject-2.0.so.0
#29 0x0000ffffbb4a05c4 in g_signal_emit_valist () from /usr/lib/libgobject-2.0.so.0
#30 0x0000ffffbb4a0ad0 in g_signal_emit () from /usr/lib/libgobject-2.0.so.0
#31 0x0000ffffbbc6cdf4 in gtk_container_remove () from /usr/lib/libgtk-3.so.0
#32 0x0000ffffbbea418c in ?? () from /usr/lib/libgtk-3.so.0
#33 0x0000ffffbb487f7c in g_object_run_dispose () from /usr/lib/libgobject-2.0.so.0
#34 0x0000ffffbbeb13a4 in ?? () from /usr/lib/libgtk-3.so.0
#35 0x0000ffffbbc6ed40 in ?? () from /usr/lib/libgtk-3.so.0
#36 0x0000ffffbb480a8c in g_closure_invoke () from /usr/lib/libgobject-2.0.so.0
#37 0x0000ffffbb496070 in ?? () from /usr/lib/libgobject-2.0.so.0
#38 0x0000ffffbb4a01c8 in g_signal_emit_valist () from /usr/lib/libgobject-2.0.so.0
#39 0x0000ffffbb4a0ad0 in g_signal_emit () from /usr/lib/libgobject-2.0.so.0
#40 0x0000ffffbbea4268 in ?? () from /usr/lib/libgtk-3.so.0
#41 0x0000ffffbbeb8248 in ?? () from /usr/lib/libgtk-3.so.0
#42 0x0000ffffbb487f7c in g_object_run_dispose () from /usr/lib/libgobject-2.0.so.0
#43 0x0000ffffbea2d324 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#44 0x0000ffffbe71bd98 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#45 0x0000ffffbe7172cc in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#46 0x0000ffffbe717308 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#47 0x0000ffffbe9e1fa0 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#48 0x0000ffffbe9e1e34 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#49 0x0000ffffbea2d280 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#50 0x0000ffffbe717de0 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#51 0x0000ffffbe71a920 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#52 0x0000ffffbe71a9a0 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#53 0x0000ffffbea29f44 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#54 0x0000ffffbbef117c in ?? () from /usr/lib/libgtk-3.so.0
#55 0x0000ffffbb480ce8 in ?? () from /usr/lib/libgobject-2.0.so.0
#56 0x0000ffffbb49fa30 in g_signal_emit_valist () from /usr/lib/libgobject-2.0.so.0
#57 0x0000ffffbb4a0ad0 in g_signal_emit () from /usr/lib/libgobject-2.0.so.0
#58 0x0000ffffbbe98660 in ?? () from /usr/lib/libgtk-3.so.0
#59 0x0000ffffbbd49bf4 in gtk_main_do_event () from /usr/lib/libgtk-3.so.0
#60 0x0000ffffbba4f3b4 in ?? () from /usr/lib/libgdk-3.so.0
#61 0x0000ffffbba82a88 in ?? () from /usr/lib/libgdk-3.so.0
#62 0x0000ffffbb378fe4 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#63 0x0000ffffbb379250 in ?? () from /usr/lib/libglib-2.0.so.0
#64 0x0000ffffbb3792f4 in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#65 0x0000ffffbbd48ce0 in gtk_main_iteration_do () from /usr/lib/libgtk-3.so.0
#66 0x0000ffffbe9873d4 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#67 0x00000000006a5508 in DesktopApplication._CallFunctionWithExceptionHandling%%o<DesktopApplication>p ()
#68 0x0000ffffbe987230 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#69 0x0000ffffbe9874d0 in ?? () from Redacted Libs/XojoGUIFrameworkARM64.so
#70 0x0000ffffbe984328 in RuntimeRun () from Redacted Libs/XojoGUIFrameworkARM64.so
#71 0x000000000070d254 in REALbasic._RuntimeRun ()
#72 0x000000000134d7fc in _Main ()
#73 0x000000000134d018 in main ()

It’s not going to sleep, is it? I had an issue because you can’t (AFAIK) put a VirtualBox VM to sleep to test what happens. Eventually I was able to find a real Windows box to test with, and found that putting the app to sleep caused a port I had open for internal communication in the app to be closed - a major issue. If your app is going to sleep, that might be something you can’t test for in a VM.

I have had issues before on macOS with AppNap/Power saving, however, the VM has been 100% with no issues. The embedded HMI is where the issues lie.

This HMI is an industrial hardened, marine certified, Linux system running a realtime kernel. You would think it would be solid.

Maybe here lies your problem. It’s not 1:1 with a standard machine. Your VM is not the same thing, and testing should be done in such an unusual environment before certification. Right now it seems it does not pass the “compatibility test”.

It is a solid, hardened OS in the sense of “not being easily broken” by user software, not “solid” in the sense that your software should work on it.

Maybe it is not solid anymore?

I have no doubt that this is where my issues lie.

It’s an older non-standard kernel that was missing libraries and is not easy to test on. It’s possible that the realtime kernel has much less patience for the Xojo app dragging its feet and just kicks it out. It’s also possible that my cross-compiled libunwind has some issues (I have tested it on various systems without issue though).

Had I known this was going to cause that many issues, I would have not agreed to use the customer’s hardware and used something I have proven to work. However, it’s now on a boat in the middle of nowhere 2000km away from me so I am stuck with my decision.

The hardware is capable of running a web browser and other standard applications without issue so it really seems like a me/Xojo problem and not a hardware problem.

Sure. It’s a software/OS issue. Their OS probably should not be touched; it’s patched. Apply only the upgrades that the vendor supplies. If you touch it, there is no guarantee against some kind of breakage or mess.

It may have some compatibility issues with those, and for what this device was made for, it wasn’t necessary, so they are not there.
Maybe even this “old system” wasn’t GTK3 ready and you “added things” to make it run?

By hardware I mean their stuff (hardware/firmware/software) vs. my/Xojo stuff (Xojo app).

Nothing has been done to their software/OS except for libunwind, which has just been included in the Xojo app libs folder.

Well… Don’t know if you can make it work reliably there without vendor compliance analysis and their help, but I wish you all the luck to make it so if possible.

Thanks, they have never heard of Xojo so I thought I would come here to check if anyone had seen some of this behavior before.

This is a very interesting discussion. Changing the monitor systems on my boat with twin diesels is going to be my first venture in moving from Windows to Linux. I have access to all of the hardware and my boxes are conventional x64 devices, so I hope I have fewer issues. I look forward to any further discussion and discoveries.

I think if you use any modern hardware like a Raspberry Pi you will be OK. Most of the issues I have had (memory leaks, threads pinning the CPU at 100%, 64-bit ARM) have been fixed over the years. The only issue I have on my own hardware is that Wayland seems to break a lot of things: touch screens do not map across multiple monitors, windows don’t open correctly, etc. I just use X for the time being and it has been good.

From Xojo requirements:

Xojo uses X11 as its backend to GTK+. Wayland will be supported in a future version. If your distro does not have X11 support installed you might be able to manually install from the Terminal:

apt-get install xserver-xorg-core

https://documentation.xojo.com/resources/system_requirements_for_current_release.html

Thanks, I did see that and swapped to X11 after I saw the weird behavior a few years ago. I have never tried Wayland since.

Luckily, my customer’s device doesn’t even support Wayland.

The reminder is more for Frank and other readers, as I suppose you knew that.

Ahh, yes, worth mentioning it as it did catch me up for quite a while.

My project consists of 3 industrial Arduino-compatible PLC systems that have multiple modules for I/O. They are open PLCs from Automation Direct, paired with Beelink cube Intel x64 computers using commercial HDMI monitors. Two systems collect data from the engines and other temperatures; the other controls lighting and other equipment.

The communication is over TCP/IP. I have programmed web servers on the Arduino devices. I send instructions as JSON from Xojo on the PCs to the engine PLCs, which return data to log and/or display. With the other system I send instructions from Xojo to the PLC to turn something on or off.

It seems to me that I should be able to move that to Linux with little trouble. The only thing I am not sure of right now is whether the gauge plugin I use from Einhugur will work.
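
For readers curious what that kind of JSON exchange might look like, here is a minimal Python sketch. It is not the Xojo code described above. The field names `target` and `action`, both function names, and the URL in the comment are all hypothetical, and it assumes the PLC's web server accepts an HTTP POST with a JSON body and replies with JSON.

```python
import json
import urllib.request


def build_command(target, action):
    """Encode one instruction as UTF-8 JSON bytes (field names are hypothetical)."""
    return json.dumps({"target": target, "action": action}).encode("utf-8")


def send_command(url, payload, timeout=2.0):
    """POST the JSON payload to the PLC's web server and decode its JSON reply."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An explicit timeout keeps an unreachable PLC from blocking the caller forever.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical usage (address and names invented for illustration):
# reply = send_command("http://192.168.1.50/command", build_command("cabin_lights", "on"))
```

Nothing here depends on the operating system, which fits the expectation that the communication side should move to Linux without much trouble.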