Web App crashes: Unexpected error 9 on netlink descriptor 9

My Web App runs on Debian 12 on a LightSail server, installed with Tim’s Lifeboat. The app depends heavily on URLConnection, because almost all it does is communicate with other servers and show the outcome in a customer-facing portal. I have subclassed URLConnection and use it for async communications, so I use delegation with ‘WeakAddressOf’ a lot to load data into the fields and listboxes upon ContentReceived.

Every now and then the app crashes with the above-mentioned error, which I found in the log. It is hard to reproduce and only occurs on Linux. The numbers shown in the error description vary a little. In the nginx log I find a more detailed description of the error:

2024/11/04 13:01:36 [error] 15158#15158: *6980 connect() failed (111: Connection refused) while connecting to upstream, client: 213.34.xxx.xxx, server: demo.xxxxx.cloud, request: "POST /9041C0EF367D81556B9B8892F273BF4EDF7B07A2D1308B799149FB94ACE65B1B/data/wjteJp/rowdata HTTP/1.1", upstream: "http://127.0.0.1:42002/9041C0EF367D81556B9B8892F273BF4EDF7B07A2D1308B799149FB94ACE65B1B/data/wjteJp/rowdata", host: "demo.xxxxx.cloud", referrer: "https://demo.xxxxx.cloud/"

demo.xxxxx.cloud is the server running my app and the IP address is my public IP address from where I was accessing the app through my browser.

At first I thought my app was creating too many connections. But the error also occurs when I limit it to one connection per user action.

I hope somebody has a clue or knows a solution. Below is the ldd output for the loaded libraries; nothing strange there, I think:

linux-vdso.so.1 (0x00007ffdc2ba3000)
XojoConsoleFramework64.so => /home/admin/.com.strawberrysw.Lifeboat/demoxxxxxcloud/xxxxx Libs/XojoConsoleFramework64.so (0x00007f32d0e00000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f32d0c1f000)
libgobject-2.0.so.0 => /lib/x86_64-linux-gnu/libgobject-2.0.so.0 (0x00007f32d0bc0000)
libglib-2.0.so.0 => /lib/x86_64-linux-gnu/libglib-2.0.so.0 (0x00007f32d0a88000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f32d3466000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f32d3461000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f32d345c000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f32d09a9000)
libunwind.so.8 => /lib/x86_64-linux-gnu/libunwind.so.8 (0x00007f32d3440000)
libunwind-x86_64.so.8 => /lib/x86_64-linux-gnu/libunwind-x86_64.so.8 (0x00007f32d098c000)
libc++.so.1 => /home/admin/.com.strawberrysw.Lifeboat/demoxxxxxcloud/xxxxx_portal Libs/libc++.so.1 (0x00007f32d0400000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f32d096c000)
/lib64/ld-linux-x86-64.so.2 (0x00007f32d3473000)
libffi.so.8 => /lib/x86_64-linux-gnu/libffi.so.8 (0x00007f32d3432000)
libpcre2-8.so.0 => /lib/x86_64-linux-gnu/libpcre2-8.so.0 (0x00007f32d08d2000)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f32d08a3000)

It’s worth noting the error message comes from the CURL library, not nginx. Here’s a thread from the last time this came up, but the resolution was that the OP was using a malformed URL: Http Socket Crashes WebServer

The line you’ve found in the nginx logs is quite common. I’ve never found the cause of why the Xojo Web app refuses the connection, but I’ve also never seen it have an effect on a running app. I don’t honestly think it’s related to your CURL / URLConnection issue.

For debugging I will create a separate logfile to store the complete request before I execute it. Maybe a timing issue on my part prevents building up a correct URL in some cases.
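A minimal sketch of that kind of pre-flight logging, written in Python only to show the idea (it is not Xojo code, and `check_url`, `log_request` and the logger name are hypothetical). It records every outgoing request before it is sent and flags a few obvious ways a URL can be malformed:

```python
import logging
from urllib.parse import urlsplit

# Hypothetical logger name; attach a FileHandler to write to a separate file.
request_log = logging.getLogger("outgoing_requests")


def check_url(url: str) -> list:
    """Return a list of problems found in url; an empty list means none found."""
    problems = []
    trimmed = url.strip()
    if trimmed != url:
        problems.append("leading or trailing whitespace")
    if any(ch.isspace() for ch in trimmed):
        problems.append("whitespace inside URL")
    try:
        parts = urlsplit(trimmed)
    except ValueError as exc:  # e.g. an unbalanced IPv6 bracket
        problems.append(f"unparsable: {exc}")
        return problems
    if parts.scheme not in ("http", "https"):
        problems.append(f"unexpected scheme {parts.scheme!r}")
    if not parts.hostname:
        problems.append("missing host")
    return problems


def log_request(method: str, url: str, body: bytes = b"") -> list:
    """Log the complete request before it is executed and return any URL problems."""
    problems = check_url(url)
    request_log.info(
        "%s %r body=%d bytes problems=%s",
        method, url, len(body), problems or "none",
    )
    return problems
```

Calling `logging.basicConfig(filename="requests.log", level=logging.INFO)` once at startup would send these lines to their own file. These checks only catch gross mistakes; a URL that passes them can still be wrong for the remote service.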

I have checked and found no malformed URLs in the log, so no luck yet reproducing this.

So this happens even before there is a connection to the socket from the web app.
It could be malformed content, a malformed packet, or something else, but clearly not from the Xojo web app (receiver) side. I don’t see why a socket in the web framework would cause this. That probably means there is an issue with libsoup2.4-1 (URLConnection on Linux) or something?

Perhaps update nginx or the main system (where the URLConnection is calling from)?
Am I reading correctly that you create URLConnection instances from one web app to another web app?

Yes the App communicates with another server through the UrlConnection instances. In this case it’s getting data from a Business Central (ERP) service. I also tested on Ubuntu, same result. I will try to make a project to reproduce the error and report the issue.

The App does indeed crash before there is a connection. The crashes seem completely random. I can repeat the exact same HTTP request multiple times with success, but at some random point it fails, throwing this error.

It looks like I managed to prevent this error. I think user interaction in my App could cause race conditions leading to too many concurrent HTTP requests. I reduced the number of URLConnections needed and gave asynchronous requests more time to complete. It has been running stably without errors since.
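The general idea of capping how many requests are in flight at once can be sketched as follows. This is an illustration in Python with asyncio, not Xojo code and not the poster's actual fix; `run_limited`, `FakeBackend` and `demo` are hypothetical names, and the backend only simulates a slow request:

```python
import asyncio


async def run_limited(jobs, limit):
    """Run zero-argument coroutine factories with at most `limit` in flight.

    Results come back in the same order as `jobs`.
    """
    gate = asyncio.Semaphore(limit)

    async def guarded(job):
        async with gate:  # wait here while `limit` requests are already running
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


class FakeBackend:
    """Stand-in for a remote service; tracks the peak number of concurrent calls."""

    def __init__(self):
        self.in_flight = 0
        self.peak = 0

    async def fetch(self, n):
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0.01)  # simulated network latency
        self.in_flight -= 1
        return n * 2


def demo(limit=2, count=6):
    """Issue `count` simulated requests and report results and peak concurrency."""
    backend = FakeBackend()
    jobs = [lambda n=n: backend.fetch(n) for n in range(count)]
    results = asyncio.run(run_limited(jobs, limit))
    return results, backend.peak
```

With `limit=2`, the simulated backend never sees more than two requests at a time, however many the user interface triggers. In a Xojo app the same cap would have to be built from whatever queueing the app already uses.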

Turns out I have the exact same issue today.

My WebApp deployed with Lifeboat crashes and restarts every 30 seconds.

This problem started happening after I upsized the droplet in the DigitalOcean manager an hour ago.

2024-11-18 10:13:09: Application is ready
2024-11-18 10:13:13: Assertion 'close_nointr(fd) != -EBADF' failed at src/basic/fd-util.c:77, function safe_close(). Aborting.
2024-11-18 10:13:17: Application is ready
2024-11-18 10:13:22: Assertion 'close_nointr(fd) != -EBADF' failed at src/basic/fd-util.c:77, function safe_close(). Aborting.
2024-11-18 10:13:25: Application is ready
2024-11-18 10:13:28: Unexpected error 9 on netlink descriptor 13.
2024-11-18 10:13:33: Application is ready

In your case, that assertion seems to be coming from the systemd service, not from the Web Application. I can’t know from the log if it’s throwing that message as a consequence of the web app quitting abruptly, due to the “Unexpected error 9 on netlink descriptor” issue that appears later.
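For reference, error 9 on Linux is EBADF ("Bad file descriptor"), the same errno the assertion checks for (`-EBADF`). Both messages are therefore consistent with some code using or closing a descriptor that was already closed, although the log alone does not show which code closed it. A small Python sketch (the function name `close_twice` is hypothetical) reproduces that errno by closing the same descriptor twice:

```python
import errno
import os


def close_twice():
    """Close one descriptor twice and return the errno raised by the second close."""
    read_fd, write_fd = os.pipe()
    os.close(write_fd)
    os.close(read_fd)
    try:
        os.close(read_fd)  # the descriptor is already closed at this point
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    code = close_twice()
    print(code, errno.errorcode.get(code), os.strerror(code))
```

In a multithreaded process the second close is especially dangerous, because the descriptor number may already have been reused by another thread for an unrelated socket or file.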

Have you also updated your OS software packages recently? Are the DigitalOcean monitoring graphs looking fine or can you spot something related to the hardware that could be causing this? Any increase in traffic maybe?

Please report these kinds of issues so we can keep track of them and fix them :pray:

Something is terribly wrong on my DigitalOcean droplet so I decided to create a new one and configure the WebApp in the exact same way.

The WebApp on the new droplet isn’t restarting every 30 seconds but every 2-5 minutes now:

2024-11-18 11:06:16: Application is ready
2024-11-18 11:17:44: Unexpected error 9 on netlink descriptor 31.
2024-11-18 11:17:48: Application is ready
2024-11-18 11:18:48: Unexpected error 9 on netlink descriptor 15.
2024-11-18 11:18:52: Application is ready
2024-11-18 11:23:44: Unexpected error 9 on netlink descriptor 25.
2024-11-18 11:23:48: Application is ready
2024-11-18 11:24:11: Unexpected error 9 on netlink descriptor 12.
2024-11-18 11:24:15: Application is ready

Have you also updated your OS software packages recently?

Not that I know of.

Are the DigitalOcean monitoring graphs looking fine or can you spot something related to the hardware that could be causing this? Any increase in traffic maybe?

Bandwidth graph is huge since I upsized the droplet.

It looks like the peak in outbound bandwidth is because each browser that has my app open is continuously trying to reconnect to the app.

Thanks a lot. Just to attempt to narrow the issue, are you using SSL through Xojo Web or is it Nginx dealing with it?

I’ve tried to prepare a sample project to force that issue to happen, without any luck, but I’ll prepare a change that tries to protect the web framework against that issue.

I believe it is Nginx directly.

I ended up creating a new DigitalOcean droplet to start from scratch.
I reverted to Lifeboat build 184 and the problem is solved :tada:.

I notified @Tim_Parnell by email about the problem.

There might be something wrong in Xojo Web, but it is induced by using Lifeboat build 201.

I will keep droplets 1 and 2 active to get access to the logs if Tim or Ricardo needs them.

Droplet3 logs (running correctly for the past 20 minutes)

2024-11-18 12:27:38: Application is ready

Droplet1 logs:

2024-11-18 10:13:09: Application is ready
2024-11-18 10:13:13: Assertion 'close_nointr(fd) != -EBADF' failed at src/basic/fd-util.c:77, function safe_close(). Aborting.
2024-11-18 10:13:17: Application is ready
2024-11-18 10:13:22: Assertion 'close_nointr(fd) != -EBADF' failed at src/basic/fd-util.c:77, function safe_close(). Aborting.
2024-11-18 10:13:25: Application is ready
2024-11-18 10:13:28: Unexpected error 9 on netlink descriptor 13.
2024-11-18 10:13:33: Application is ready
2024-11-18 10:13:37: Unexpected error 9 on netlink descriptor 24.
[...]
2024-11-18 10:59:33: Application is ready
2024-11-18 10:59:38: Unexpected error 9 on netlink descriptor 33.
2024-11-18 10:59:42: Application is ready
2024-11-18 10:59:52: Assertion 'close_nointr(fd) != -EBADF' failed at src/basic/fd-util.c:77, function safe_close(). Aborting.
2024-11-18 10:59:56: Application is ready
2024-11-18 10:59:59: Assertion 'close_nointr(fd) != -EBADF' failed at src/basic/fd-util.c:77, function safe_close(). Aborting.
2024-11-18 11:00:03: Application is ready
2024-11-18 11:00:07: Unexpected error 9 on netlink descriptor 23.
[...]
2024-11-18 11:02:32: Assertion 'close_nointr(fd) != -EBADF' failed at src/basic/fd-util.c:77, function safe_close(). Aborting.
2024-11-18 11:02:36: Application is ready
2024-11-18 11:02:42: Assertion 'close_nointr(fd) != -EBADF' failed at src/basic/fd-util.c:77, function safe_close(). Aborting.
2024-11-18 11:02:46: Application is ready
2024-11-18 11:03:00: Assertion 'close_nointr(fd) != -EBADF' failed at src/basic/fd-util.c:77, function safe_close(). Aborting.
2024-11-18 11:03:04: Application is ready

@Jacco_Slok could this be related to your issue or were you already using a previous version of Lifeboat?

Jeremie says the error started when he used Lifeboat 201 and upsized a DO droplet.
Lifeboat 201 was not available November 04 when this topic started.
I think there was another Lifeboat version after 184?

I think so, yep

Yes, I remember skipping a version of Lifeboat before downloading b201

I am running 201 now, without issues. My previous version was 198.

I don’t know, I was using version 198. I think latency caused a condition where my subclassed URLConnection sometimes (one in 20 to 50 occasions) started too soon in the Shown event of a webpage. A simple Thread.SleepCurrent of 250 milliseconds just before URLConnection.Send did the trick for me. If I remove this I can replicate my issue again. As said, this wasn’t an issue on a Windows Server.

Problem is happening again today:

2024-11-19 13:26:21: Application is ready
2024-11-19 13:26:27: Unexpected error 9 on netlink descriptor 18.
2024-11-19 13:26:31: Application is ready
2024-11-19 13:26:34: Unexpected error 9 on netlink descriptor 30.
2024-11-19 13:26:38: Application is ready
2024-11-19 13:26:47: Unexpected error 9 on netlink descriptor 12.
2024-11-19 13:26:51: Application is ready
2024-11-19 13:26:56: Unexpected error 9 on netlink descriptor 15.
2024-11-19 13:26:59: Application is ready
2024-11-19 13:33:25: Unexpected error 9 on netlink descriptor 48.
2024-11-19 13:33:29: Application is ready
2024-11-19 13:47:48: Unexpected error 9 on netlink descriptor 70.
2024-11-19 13:47:52: Application is ready
2024-11-19 13:48:07: Unexpected error 9 on netlink descriptor 52.
2024-11-19 13:48:11: Application is ready
2024-11-19 13:48:14: Unexpected error 9 on netlink descriptor 11.
2024-11-19 13:48:18: Application is ready
2024-11-19 13:48:22: Unexpected error 9 on netlink descriptor 27.
2024-11-19 13:48:26: Application is ready
2024-11-19 13:49:30: Unexpected error 9 on netlink descriptor 27.
2024-11-19 13:49:34: Application is ready
2024-11-19 13:50:07: App is shutting down: Request shutdown
2024-11-19 13:50:16: Application is ready
2024-11-19 13:50:21: Unexpected error 9 on netlink descriptor 18.
2024-11-19 13:50:25: Application is ready
2024-11-19 13:52:26: Unexpected error 9 on netlink descriptor 22.
2024-11-19 13:52:30: Application is ready
2024-11-19 13:54:20: Unexpected error 9 on netlink descriptor 30.
2024-11-19 13:54:24: Application is ready
2024-11-19 13:54:45: App is shutting down: Request shutdown
2024-11-19 13:54:58: Application is ready
2024-11-19 13:55:05: Unexpected error 9 on netlink descriptor 57.
2024-11-19 13:55:10: Application is ready

I will try Jacco’s workaround.