NOE In The Xojo Framework Socket Code?

Hey everyone - I’m not sure, but I think I’ve found a Nil Object Exception happening in the Xojo framework for socket code. I have not had a chance to get more information from my customer’s machine yet (my app writes detailed error report logs that will help me pinpoint exactly where this is happening… hopefully later today) but I do have the following stack trace:

EXCEPTION: Nil Object Exception:
//------------------------------//
RuntimeRaiseException
RaiseNilObjectException
HTTPSecureSocket._ParseHeaders%%os
HTTPSecureSocket._HandleNewData%%o
HTTPSecureSocket.Event_DataAvailable%%o
SSLSocket.dylib$3211
SSLSocket.dylib$3210
SSLSocket.dylib$3208
SSLSocket.dylib$3200
SSLSocket.dylib$3181
SSLSocket.dylib$3189
SSLSocket._Poll%%o
_Z16PollPollableListv
XojoFramework$7109
XojoFramework$7891
CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION
__CFRunLoopDoTimer
__CFRunLoopDoTimers
__CFRunLoopRun
CFRunLoopRunSpecific
RunCurrentEventLoopInMode
ReceiveNextEventCommon
_BlockUntilNextEventMatchingListInModeWithFilter
_DPSNextEvent
-[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:]
XojoFramework$4513
XojoFramework$4514
Application._CallFunctionWithExceptionHandling%%op
_Z33CallFunctionWithExceptionHandlingPFvvE
XojoFramework$4513
-[NSApplication run]
RuntimeRun
REALbasic._RuntimeRun
_Main
main

//------------------------------//

Note that the NOE appears to be thrown in HTTPSecureSocket._ParseHeaders.

Has anyone else seen this?
I’m afraid I don’t know yet which version of Xojo was used to build this, and I won’t find out until I get more details from the user’s computer later today.

The customer is running OS X 10.13.6.

Thanks!

My guess is that the server doesn’t return HTTP/1.0-compliant responses and that it’s tripping up our parsing code. Try using the newer Xojo.Net.HTTPSocket and I’ll bet you have better luck.

See, the problem with that is that the new sockets are asynchronous only, so “try using the newer socket” is really not a super helpful solution for me right now. Doing so will force a LOT of refactoring in a very complex project - particularly because I don’t even know which of my sockets are throwing this error… the app crashes before getting into any of my code, so I can’t tell what is causing it. This means I’ll have to change ALL the sockets in my app (and there are a bunch for doing a bunch of different types of things). The potential for introducing all kinds of new issues with a refactor like that is of greater concern to me - particularly because the future roadmap of the new Xojo.Net.HTTPSocket is not very clear, given the discussion at the last XDC.

I assume if I want this to get looked at in the old socket at all, I’ll have to create a sample application that demonstrates the issue and open a feedback? This is also not great for me, as it takes time I do not have, and this issue is only intermittent for my users, so I cannot reliably reproduce it.

Color me frustrated.

If you have the MBS plugin, CURL might be an option to try.

Unfortunately without that we have zero chance of reproducing it on our end. Could you at least give us a URL to work with?

I wish I could. This app uses a bunch of sockets to talk to a bunch of different webservices (all with a bunch of different endpoints). Since I have no idea which socket is causing the NOE, I don’t really have any idea even which webservice is causing the issue, let alone which specific URL / endpoint.

While I’m sure this is true, I’m also sure that someone could at least take a look at the code in HTTPSecureSocket._ParseHeaders to see what the likely culprit throwing an NOE is.

I frequently get error reports from my users that are not able to be reproduced - but my logging in the app at least narrows things down to the method where the problem is happening, and I’m always able to figure out what could possibly be triggering the error, based on the code in the suspect method.

The version of my app that is throwing this error was built with either 2018R2 or 2018R3B1, not sure which.

The endpoints that my app talks to have not changed (that I know of) significantly lately - though there may be changes under the hood to some of them that I’m not aware of.

I’d humbly suggest that the above information is enough to at least take a look at that method to see what could possibly be throwing an NOE, and guard against it. (pre-check things for nil, catch an exception and handle the error correctly, etc).

Two engineers have already looked at that code and there is no obvious place that a NilObjectException could occur. The method takes a string and splits it into an array of key value pairs.

Knowing what those endpoints are, even if it’s in a private case, is the only way that we’re going to track this down.
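
For anyone following along, here is a rough sketch of what "take a string and split it into key/value pairs" can look like when it is written defensively. This is Python for illustration only, not the framework's actual code; the function name `parse_headers` and its behavior (skipping malformed lines, lowercasing keys) are assumptions made for the example.

```python
def parse_headers(raw: str) -> dict:
    """Split a raw header block into a dict of lowercase key -> value.

    Hypothetical helper for illustration; lines without a colon
    (such as the status line) are skipped instead of raising.
    """
    headers = {}
    if not raw:
        # Guard against an empty or missing header block.
        return headers
    for line in raw.split("\r\n"):
        if not line.strip():
            continue  # blank line, e.g. the separator before the body
        key, sep, value = line.partition(":")
        if not sep:
            continue  # no colon found: not a "Key: Value" line
        headers[key.strip().lower()] = value.strip()
    return headers
```

The point of the guards is that an unexpected line is ignored rather than turning into an exception deep inside the socket's event handling.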

Opened feedback case with more details: <https://xojo.com/issue/53588>

Thanks!

Just to add one more guess: are you sure the HTTPSecureSocket is still (and always) referenced (while it tries to fetch data)?
I’m thinking about a Thread that has an HTTPSecureSocket. The Thread gets killed while the Socket is still trying to fetch. It might no longer be fully there when the headers finally arrive.
I faintly remember something similar with an HTTPSocket on a window. The window got closed while the Socket was still trying to fetch.

@Kimball Larsen – I’ve looked at your case and while I appreciate that you filed it, we’re still no closer to having reproduction steps. I stand by my original assessment though. Some newer services don’t honor an HTTP/1.0 request and if these guys are doing that, the HTTPSecureSocket may not be able to handle the response.

My guess is that in some circumstances, the response from this call is far too big for the server to load into memory all at once. HTTP/1.1 servers have a mechanism called “chunked” responses, where the total size of the response is not known up front and the body is streamed in smaller chunks. I can tell you for sure that HTTPSecureSocket can’t handle a response like that.
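
To make the "chunked" idea concrete, here is a minimal sketch of decoding a chunked body, written in Python purely for illustration. It assumes the whole body is already in memory, ignores trailer headers, and the name `decode_chunked` is made up for this example; it says nothing about how any Xojo socket is implemented.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble a chunked HTTP/1.1 body (illustrative sketch only).

    Each chunk is "<hex size>[;extensions]\r\n<data>\r\n"; a chunk of
    size 0 ends the body. Trailer headers are not handled here.
    """
    out = bytearray()
    pos = 0
    while True:
        line_end = body.find(b"\r\n", pos)
        if line_end == -1:
            raise ValueError("truncated chunk header")
        # Drop optional chunk extensions after ';' and parse the hex size.
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field, 16)
        pos = line_end + 2
        if size == 0:
            break  # terminating zero-length chunk
        chunk = body[pos:pos + size]
        if len(chunk) < size:
            raise ValueError("truncated chunk data")
        out += chunk
        pos += size + 2  # skip the data and the CRLF that follows it
    return bytes(out)
```

A client that expects a single Content-Length body and receives this framing instead will see the hex size lines mixed into the data, which is the kind of mismatch being described here.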

As I mentioned on the case:

Otherwise without the ability for us to reproduce the exact condition that the API is in, we’ve got no chance to reproduce it.

Here’s the raw response to that request without any credentials using the HTTPSecureSocket. It actually tells us a lot:

HTTP/1.1 404 Not Found
Server: nginx/1.10.3 (Ubuntu)
Date: Wed, 03 Oct 2018 21:11:18 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 0
Connection: close
X-Request-Id: b3931d99-6f0f-4d83-aa1f-68c3ce8fec41
X-Runtime: 0.000921
Strict-Transport-Security: max-age=31536000; includeSubDomains

See that first line? The server is returning an http/1.1 response, regardless of the fact that the socket is telling it that it can only handle a 1.0 request.
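
If you want to log which protocol version a server answers with, a small check on the status line is enough. The sketch below is Python for illustration; `http_version` is a hypothetical helper name, and it assumes the raw response uses CRLF line endings.

```python
def http_version(raw_response: str) -> str:
    """Return the version from the status line, e.g. "1.1".

    Hypothetical helper; raises ValueError if the first line does not
    look like "HTTP/<version> <code> <reason>".
    """
    status_line = raw_response.split("\r\n", 1)[0]
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError("not an HTTP status line: %r" % status_line)
    return parts[0][len("HTTP/"):]
```

Logging this value next to the request URL in an error report would at least show which endpoints answer with 1.1.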

The only remedy for this is to upgrade to the new sockets.

sigh. Ok, off I go to refactor.

Also, would you mind masking the exact URL of the endpoint I provided in the feedback case? I note you posted it publicly above, and I don’t really want it available publicly (which is why I made the case private in the first place).

Thanks for looking deeply into this.

Done.