Hey everyone - I’m not sure, but I think I’ve found a Nil Object Exception happening in the Xojo framework for socket code. I have not had a chance to get more information from my customer’s machine yet (my app writes detailed error report logs that will help me pinpoint exactly where this is happening… hopefully later today) but I do have the following stack trace:
See, the problem with that is that the new sockets are asynchronous only, so “try using the newer socket” is really not a super helpful solution for me right now. Doing so will force a LOT of refactoring in a very complex project - particularly because I don’t even know which of my sockets are throwing this error… the app crashes before getting into any of my code, so I can’t tell what is causing it. This means I’ll have to change ALL the sockets in my app (and there are a bunch for doing a bunch of different types of things). The potential for introducing all kinds of new issues with a refactor like that is of greater concern to me - particularly because the future roadmap of the new Xojo.Net.HTTPSocket is not very clear, given the discussion at the last XDC.
I assume if I want this to get looked at in the old socket at all, I’ll have to create a sample application that demonstrates the issue and open a feedback? This is also not great for me, as it takes time I do not have, and this issue is only intermittent for my users, so I cannot reliably reproduce it.
I wish I could. This app uses a bunch of sockets to talk to a bunch of different webservices (all with a bunch of different endpoints). Since I have no idea which socket is causing the NOE, I don’t really have any idea even which webservice is causing the issue, let alone which specific URL / endpoint.
While I’m sure this is true, I’m also sure that someone could at least take a look at the code in HTTPSecureSocket.ParseHeaders() to see what is the likely culprit throwing a NOE.
I frequently get error reports from my users that cannot be reproduced - but my logging in the app at least narrows things down to the method where the problem is happening, and I’m always able to figure out what could possibly be triggering the error, based on the code in the suspect method.
The version of my app that is throwing this error was built with either 2018R2 or 2018R3B1, not sure which.
The endpoints that my app talks to have not changed (that I know of) significantly lately - though there may be changes under the hood to some of them that I’m not aware of.
I’d humbly suggest that the above information is enough to at least take a look at that method to see what could possibly be throwing an NOE, and guard against it. (pre-check things for nil, catch an exception and handle the error correctly, etc).
Just to add one more guess: are you sure the HTTPSecureSocket is still (and always) referenced (while it tries to fetch data)?
I’m thinking about a Thread that has a HTTPSecureSocket. Thread gets killed, while Socket is still trying to fetch. It might no longer be fully there when the Headers finally have arrived.
I faintly remember something similar with a HTTPSocket on a window. Window got closed, while the Socket was still trying to fetch.
@Kimball Larsen I’ve looked at your case and while I appreciate that you filed it, we’re still no closer to having reproduction steps. I stand by my original assessment though. Some newer services don’t honor an HTTP/1.0 request and if these guys are doing that, the HTTPSecureSocket may not be able to handle the response.
My guess is that in some circumstances, the response from this call is far too big for the server to load into memory all at once. HTTP/1.1 servers have a new mechanism called “chunked” responses where the total size of the response is unknown and the body is streamed in smaller chunks. I can tell you for sure that HTTPSecureSocket can’t handle a response like that.
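For anyone following along who hasn’t met chunked encoding before, here is a minimal sketch of what a client has to do to reassemble such a body. It is written in Python purely for illustration. It is not the Xojo framework’s code, and `decode_chunked` is a made-up name. It assumes the whole body is already in memory and ignores trailers and chunk extensions.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble an HTTP/1.1 chunked message body (illustrative sketch only)."""
    decoded = bytearray()
    pos = 0
    while True:
        # Each chunk starts with its size in hexadecimal, terminated by CRLF.
        line_end = body.find(b"\r\n", pos)
        if line_end == -1:
            raise ValueError("truncated chunk-size line")
        # Anything after ';' on the size line is a chunk extension; skip it.
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field, 16)
        pos = line_end + 2
        if size == 0:
            break  # a zero-length chunk marks the end of the body
        chunk = body[pos:pos + size]
        if len(chunk) != size:
            raise ValueError("truncated chunk data")
        decoded += chunk
        pos += size
        # Chunk data is followed by its own CRLF.
        if body[pos:pos + 2] != b"\r\n":
            raise ValueError("missing CRLF after chunk data")
        pos += 2
    return bytes(decoded)


sample = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
print(decode_chunked(sample))
```

A client that only understands `Content-Length` has no way to know where a body like `sample` ends, which is the kind of response an HTTP/1.0-only socket would trip over.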
As I mentioned on the case:
Otherwise, without the ability for us to reproduce the exact condition that the API is in, we’ve got no chance to reproduce it.
Here’s the raw response to that request without any credentials using the HTTPSecureSocket. It actually tells us a lot:
HTTP/1.1 404 Not Found
Server: nginx/1.10.3 (Ubuntu)
Date: Wed, 03 Oct 2018 21:11:18 GMT
Content-Type: text/html; charset=utf-8
Strict-Transport-Security: max-age=31536000; includeSubDomains
See that first line? The server is returning an HTTP/1.1 response, even though the socket is telling it that it can only handle HTTP/1.0.
The only remedy for this is to upgrade to the new sockets.
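To make the point about the status line concrete, here is a small Python sketch that pulls the protocol version out of the raw response quoted above. It is only an illustration: `parse_response_head` is a hypothetical helper, not anything in the Xojo framework, and it says nothing about how HTTPSecureSocket.ParseHeaders() actually works. It skips malformed header lines instead of failing on them, which is one possible way to guard a parser.

```python
def parse_response_head(raw: str):
    """Split a raw HTTP response head into version, status, reason, headers."""
    lines = raw.replace("\r\n", "\n").split("\n")
    parts = lines[0].split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError("not an HTTP status line: %r" % lines[0])
    version = parts[0][len("HTTP/"):]
    status = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the header block
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip a malformed line rather than crash on it
        headers[name.strip().lower()] = value.strip()
    return version, status, reason, headers


raw = (
    "HTTP/1.1 404 Not Found\r\n"
    "Server: nginx/1.10.3 (Ubuntu)\r\n"
    "Date: Wed, 03 Oct 2018 21:11:18 GMT\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Strict-Transport-Security: max-age=31536000; includeSubDomains\r\n"
    "\r\n"
)
version, status, reason, headers = parse_response_head(raw)
if version != "1.0":
    print("server answered with HTTP/" + version + " to an HTTP/1.0 client")
```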
Also, would you mind masking the exact URL of the endpoint I provided in the feedback case? I see you posted it publicly above, and I don’t really want it available publicly (which is why I made the case private in the first place).