TCP Socket error when connection is lost

Perry_Paolantonio · November 4, 2021, 8:58pm

How would I catch a lost connection in a TCP socket? The error code for this appears to be 102. The documentation says:

You will get this error if the remote side disconnects, whether it’s forcibly (by pulling their ethernet cable out of the computer) or gracefully (by calling SocketCore’s Close method). This may or may not be a true error situation. If the remote side closed the connection, then it is not truly an error; it’s just a status indication.

But if I have an open connection and then turn the server off, my client still seems to think it’s connected. In the Error event handler for the client I have the following code:

var msg as string = err.ErrorNumber.ToString
ClearCoreAlerts.AddText ( msg )

This should print the error message to a text area where I can see it was raised. But when I turn the server off and wait, and wait, and wait (for many minutes), it’s never raised. If I send a command, it’s just being sent into the ether.

What do I need to do in this situation to know that the connection has been broken?

Arnaud_N · November 5, 2021, 10:21am

It looks like the server just stops processing things but stays connected…
What’s your “server”?

Perry_Paolantonio · November 5, 2021, 11:02am

It’s a Teknic Clearcore microcontroller. I am physically cutting the power to it so the server is completely dead.

Sunil_Abraham · November 5, 2021, 11:02am

I am also in the same boat. I don’t know why Xojo can’t detect a disconnection. I tried with programs developed in Java and C#, and both detect the disconnection.

James_Sentman · November 5, 2021, 1:45pm

When you experimented with other things, did you use a raw socket or set a bunch of options on it? I suspect, but am not certain, that there is something like TCP keepalive that you need to set in order for the error to be noticed in anything resembling a normal amount of time. Or possibly there is some other timeout that can be set that Xojo isn’t setting by default.

When building my own connection protocols I’ve always implemented a simple ping packet back and forth to catch disconnections more reliably. When talking to something that I can’t control, there is often, but not always, some existing command you can use the same way. One device that I have to work with doesn’t catch the disconnect message and has no ping-type message, but you can request the firmware version as often as you like, so I do that as a ping whenever it has been more than a few minutes since I last got any valid data from it.
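For reference, the keepalive option mentioned here is an OS-level socket setting, not something specific to Xojo. The sketch below is in Python rather than Xojo, only because Python exposes the raw socket options directly and the snippet can be run on its own. The helper name `enable_keepalive` and the timing values are illustrative assumptions. `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` are the Linux option names and may be absent on other platforms, so the code skips any it cannot find.

```python
import socket


def enable_keepalive(sock, idle_s=2, interval_s=1, probes=3):
    """Turn on TCP keepalive and, where the platform allows, shorten its timing.

    Returns a dict of the options that were actually applied.
    """
    # SO_KEEPALIVE itself is widely available.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {"SO_KEEPALIVE": True}

    # The tuning knobs are platform-specific, so look them up defensively.
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before the link is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
        except OSError:
            pass  # the OS refused this option; leave its default in place
    return applied
```

With keepalive on but untuned, many systems wait on the order of two hours before sending the first probe, so enabling the option without also shortening its timing is unlikely to help when a loss has to be noticed within seconds.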

Perry_Paolantonio · November 5, 2021, 1:59pm

I’m setting it up using the GUI - dragging and dropping a TCP Socket to my project. I’m setting a default IP and port, and also have a preferences window where the user can override that. The entirety of the code that initiates the connection is:

ClearCoreSend.Address = App.remoteServerIP
ClearCoreSend.Port = App.remoteServerPort.ToInteger
ClearCoreSend.Connect

I don’t see a keepalive option in Xojo TCPSockets.

The firmware developer will be implementing something like what you’ve described, to determine if the connection is good from the client on the Teknic box (either pinging my software or more likely sending a command that will elicit a response from my Xojo server). I’ll do the same for the situation I’m dealing with - there are already some basic status functions I can test that with, but we’ll probably make something specific to see if the ClearCore is up and running.

Perry_Paolantonio · November 5, 2021, 2:07pm

I just added a button to my UI with this code in it:

var msg as string
if ClearCoreSend.IsConnected = true then
  msg = "Connection to ClearCore Up"
else
  msg = "Connection to ClearCore Down"
end if

responseMessages.AddText( msg.DefineEncoding(Encodings.UTF8))

ClearCore OFF, No connection started from the software:
Connection to ClearCore Down
ClearCore ON, Connection established from the software:
Connection to ClearCore Up
ClearCore powered off, after establishing connection:
Connection to ClearCore Up

So IsConnected doesn’t work, since it thinks the connection is still valid even though the server is powered off.

I mean, this feels like a bug to me.

James_Sentman · November 5, 2021, 2:11pm

Sorry, that part of the message was meant to be a response to Sunil Abraham, who had suggested that in C# and Java it worked fine. I was wondering how he was creating the socket in either of those, and whether some of those lower-level flags were being set. You can’t set them in Xojo without some declares or plugins. I know you can set them with the MBS plugins, as I’ve experimented with that, though not specifically to find an answer to this problem.

Perry_Paolantonio · November 5, 2021, 2:15pm

So this has me wondering - is keepalive client-side only or is it possible for the server to somehow convince a client to keep trying even in the event of a disconnect?

I’m wondering if something on the server side is issuing something like a keepalive when the connection is established, which Xojo is responding to accordingly, even when the server goes offline. Or is that not even possible?

kevin_g · November 5, 2021, 2:17pm

I don’t think Xojo sockets have any way to know if the connection is still alive.

If you call Socket.Poll on a regular basis, you might find that it sets the error state when the socket is invalid. Otherwise, you probably need to send some kind of ping command on a regular basis.

Perry_Paolantonio · November 5, 2021, 2:31pm

Hmm. So then is IsConnected only for testing whether you’ve established the initial connection and nothing more? The docs for TCPSocket.IsConnected seem to say otherwise.

What about Error 102, which should get tripped when the connection is physically gone? To test this, I did exactly what the docs say should trigger it (unplugging the ethernet connection from the server), and it still doesn’t trigger an error.

kevin_g · November 5, 2021, 2:51pm

From my own experience, that is how it appears to work.

I’m pretty sure you only get that error when you try to perform an action on the socket after it has been disconnected.

Perry_Paolantonio · November 5, 2021, 2:53pm

Would sending something over the socket be considered “performing an action?” – because I can send stuff, but get no response or error.

I’ll try seeing if socket.poll gives it a kick, and if so, maybe set that up in a timer or something.

Perry_Paolantonio · November 5, 2021, 3:09pm

So I added a “Poll” button that calls ClearCoreSend.poll. I established a connection (powered on my software, clicked the Connect button, then powered on the ClearCore unit). Once the ClearCore finishes its initialization routine, I am able to send and receive with it.

I powered off the ClearCore while my software was still running, clicked the Poll button, and issued a command.

What seemed like two or three minutes later, a 102 error was raised. So there is an incredibly long timeout here, and that makes me wonder if @James_Sentman’s idea about keepalive being active is what’s happening.

We can’t wait that long, so I guess the only real option is to create some kind of ping on a timer, that tests the connection periodically.

Perry_Paolantonio · November 5, 2021, 8:00pm

Actually, how would one go about implementing a server ping test in Xojo? That is, if I issue a command using TCPSocket.Write, the response is received in the DataAvailable event. That works fine.

But if I’m issuing a command to the server like GetServerStatus and expecting a response back (say, an integer 1 for “I’m here”), how can I associate the 1 in the DataAvailable event, if it’s coming in asynchronously, with the test that’s checking to see if the server is alive?

My thought was to set up a timer that just checks the server every x seconds, but in order for that to work, the timer needs to wait for a response. But that response is getting picked up by the DataAvailable event, and is handled by the code there. It has no idea the timer triggered a server checkup.

The problem is that what I’m really interested in here is the absence of a response, so my regular DataAvailable code can have no idea that it has to respond to an absence of data. Does that make sense?

TimStreater · November 5, 2021, 8:52pm

In addition to the integer 1, the response needs to carry a code indicating that it is a response to a status request and not just some data that’s coming in. Your server should expect requests to be labelled as part of the data stream, and its responses should be labelled too. Then you can mix different requests together and still sort out which response goes with which request. The absence of a response to a keepalive should be detected with a timer.
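As a rough illustration of labelled requests and responses, here is a minimal sketch. It is written in Python rather than Xojo so that it can run on its own. The wire format (a kind, a space, a number, and a newline), the `PING`/`PONG` names, and all three function names are assumptions made up for the example; they are not the ClearCore firmware’s actual protocol.

```python
def encode_request(kind, request_id):
    """Build one newline-terminated request line, e.g. b'PING 7\\n'."""
    return f"{kind} {request_id}\n".encode("ascii")


def parse_reply(line):
    """Split a reply line into (kind, request_id); return None if malformed."""
    parts = line.decode("ascii", errors="replace").strip().split()
    if len(parts) != 2 or not parts[1].isdigit():
        return None
    return parts[0], int(parts[1])


def route_reply(line, handlers):
    """Hand a reply to the handler registered for its kind.

    `handlers` maps a kind such as 'PONG' to a callable taking the request id.
    Returns True if a handler ran, False if the line was not recognised.
    """
    parsed = parse_reply(line)
    if parsed is None:
        return False
    kind, request_id = parsed
    handler = handlers.get(kind)
    if handler is None:
        return False
    handler(request_id)
    return True
```

In the Xojo client, the equivalent of `route_reply` would live in the DataAvailable handler: regular command responses go to their usual code, while status replies go to whatever is tracking the ping.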

Perry_Paolantonio · November 5, 2021, 8:58pm

Right, I get that – all of our other commands come back with a code telling us what it’s in response to.

But my question is: how do I know that I didn’t get a response if I didn’t get a response? I guess I could have one timer put a “serial number” into an array when it issues a ping request. Then another timer could check for responses (the server would echo back the serial number), and then timer #2 deletes serial numbers for which it has received responses, and then if there’s more than some number of items in the array we can assume there were no responses for those serial numbers and that the connection was lost.

My gut tells me that could be a problematic way to do this though. There must be a simpler way, no?
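One possibly simpler variant of the serial-number idea is to allow only one ping in flight at a time, so that a single timer and a single remembered serial number are enough. The sketch below is in Python rather than Xojo so that it can run on its own. The `PingMonitor` class, its method names, and the default timings are hypothetical. The clock value is passed in by the caller, which keeps the logic independent of any particular timer.

```python
class PingMonitor:
    """Tracks one outstanding ping and reports when it goes unanswered."""

    def __init__(self, interval_s=2.0, timeout_s=1.0):
        self.interval_s = interval_s  # minimum gap between pings
        self.timeout_s = timeout_s    # how long to wait for a reply
        self.next_id = 1
        self.pending_id = None        # serial number still awaiting a reply
        self.sent_at = None           # time the most recent ping was sent

    def on_tick(self, now):
        """Call frequently from a timer.

        Returns ('send', id) when a ping should be written to the socket,
        ('lost', id) when the pending ping has timed out, or None otherwise.
        """
        if self.pending_id is not None:
            if now - self.sent_at >= self.timeout_s:
                lost_id, self.pending_id = self.pending_id, None
                return ("lost", lost_id)
            return None
        if self.sent_at is None or now - self.sent_at >= self.interval_s:
            self.pending_id = self.next_id
            self.next_id += 1
            self.sent_at = now
            return ("send", self.pending_id)
        return None

    def on_reply(self, reply_id):
        """Call from the data handler when the server echoes a serial number."""
        if reply_id == self.pending_id:
            self.pending_id = None
            return True
        return False  # stale or unexpected reply; ignore it
```

The timer handler would call `on_tick` with a monotonic clock reading and act on the result, and the data handler would call `on_reply`. The “no response” case is then just the `('lost', id)` result from the timer side, so the data handler never has to react to an absence of data.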

TimStreater · November 5, 2021, 9:00pm

I’d have thought that if you set a timeout of say 20 secs and your numbered request hasn’t caused a same-numbered response in that time, then you can declare the connection lost.

Perry_Paolantonio · November 5, 2021, 9:03pm

20 seconds is way too long though. We’re talking about a timer that’s going to poll the server every 2 seconds at the longest, I think. I need to know quickly if a connection was lost.
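To put numbers on that, here is a small worked calculation. It is in Python rather than Xojo so that it can run on its own. It assumes a scheme where a ping is sent every `interval_s` seconds and the connection is declared lost if no reply arrives within `timeout_s` seconds, with the timeout no longer than the interval so that pings never overlap. The function name is made up for illustration.

```python
def worst_case_detection_s(interval_s, timeout_s):
    """Upper bound on the delay between the link dying and the loss being noticed.

    Worst case: the link dies just after a reply arrives, so the client waits
    up to one full interval to send the next ping, then one timeout for the
    reply that never comes.
    """
    if timeout_s > interval_s:
        raise ValueError("this estimate assumes timeout_s <= interval_s")
    return interval_s + timeout_s


# Example: ping every 2 s and give the device 1 s to answer.
print(worst_case_detection_s(2.0, 1.0))  # 3.0
```

So a 2-second polling interval with a 1-second reply timeout would bound detection at roughly 3 seconds, plus whatever granularity the timer adds. The timeout still has to be comfortably longer than the device’s normal response time, or healthy connections will be reported as lost.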