Odd Nil Object Exception

Hey all,

I’m having an odd Nil Object Exception (NOE) show up in my app when running a specific process off a timer. To keep it simple: I have a button control stored in a Dictionary that gets virtually pushed when the user schedules it. The button push sends a command via a TCP socket to a device, which is an object in Xojo. Everything seems to run flawlessly on OS X, but not on Windows. Bear with all the code I’m posting here; I tie it together at the end…

When setting up the TCP connection to the device, I add a handler for an event that the device object raises when the TCP socket’s Connected event fires. When the code finishes executing, the handler is removed. Here is how I am setting things up. It says serial port because it is basically an IP-serial device.

Public Function ConnectToSerialPort() as Boolean
  If RS232Mode = "Guest Mode" Then  // We only send commands when the device is in a certain mode
    System.DebugLog "Attempting to connect to the RS232 Port of Device "+me.MyName
    // Add the handler
    Try
      AddHandler SerialPortSocket.Connected, WeakAddressOf SerialPortSocketConnected
    Catch
    End Try
    // Set up the socket
    SerialPortSocket.Port = 6752
    SerialPortSocket.Address = IPAddress
    SerialPortSocket.NetworkInterface = SelectedNIC
    SerialPortSocket.Connect    
  Else
    Return False
  End If 
End Function

The Connected event handler of the socket, the method SerialPortSocketConnected, looks like this:

Public Sub SerialPortSocketConnected(t as TCPSocket)
  // Trying to discover the NOE
  If t = Nil Then Break
  If me = Nil Then Break
  
  System.DebugLog "In SerialPortSocketConnected Event for Device "+me.MyName
  RaiseEvent Connected6752
End Sub

The event Connected6752 is the event that gets the handler added and removed in the Action event of the pushbutton. Here is some of that code:

    If Me.DevicePort = 6752 And Not Me.SendToAV Then
     
      System.DebugLog "Need to send command to Device RS232 for ButtonID "+str(me.ButtonID)
      Try
        AddHandler j.Connected6752, WeakAddressOf SendDevRS232   // j is my device with the socket
      Catch
      End Try

      System.DebugLog "Sending RS232 command "+CommandCode+" to Device "+j.MyName+" From RemoteButton "+str(ButtonID)
      SendDevRS232(j)

The SendDevRS232 method (which is also the handler for the device’s Connected6752 event) then looks like:

Public Sub SendDevRS232(optional j as JDevice)
  If Not j.SerialPortSocket.IsConnected Then
    Call j.ConnectToSerialPort
    System.DebugLog "Not Connected to Device Serial Port for JDevice "+j.MyName+" Connecting To it Now."
    Return
  End If
  
  System.DebugLog "Connected to Device Serial Port for JDevice "+j.MyName+" Going to Send Commands."
  
  // Get the commands from the dictionary
  Dim Commands() As String
  Dim DelayValue() As Integer
  Dim d As Dictionary = SetUpCommandTiming
  
  Commands = d.Value("Commands")
  DelayValue = d.Value("DelayValue")
  Dim TotalDelay As Integer = 0
  
  If DelayValue.Ubound <> Commands.Ubound Then
    Break
  End If
  
  // We might have multiple commands with a delay between commands. So loop over them and accumulate the delay
  For i As Integer = 0 To Commands.Ubound
    System.DebugLog "Sending Commands to the RS232 Port of Device "+j.MyName+" Command is "+commands(i)
    TotalDelay = TotalDelay+DelayValue(i)
    System.DebugLog "Delay value for command timer for Device "+j.MyName+" is "+str(TotalDelay)
    If TotalDelay < 0 Then TotalDelay = 0
    // Send the command to the socket after the delay
    Xojo.Core.Timer.CallLater(TotalDelay, AddressOf j.SendRS232Command, commands(i).ConvertHexStringToAscii)
  Next
  // Remove the handler
  RemoveHandler j.Connected6752, WeakAddressOf SendDevRS232
End Sub

I’ve posted all of this to ask how odd it is that I get error stacks like this for this NOE. Here is some of the stack trace:

RuntimeRaiseException
RaiseNilObjectException
Delegate.IM_Invoke%%<SegmentedControl>
AddHandler.Stub.29%%
JDevice.SerialPortSocketConnected%%o<JDevice>o<TCPSocket>
Delegate.IM_Invoke%%<Segm

That’s all I have, as the message box that pops up in my app doesn’t show more than that. I do have these logged, but it appears my client’s computer is either not allowing the log files to be saved or is saving them someplace other than where I expect them.

Here’s a second one:

RuntimeRaiseException
RaiseNilObjectException
Delegate.IM_Invoke%%<ComboBox>
AddHandler.Stub.29%%
JDevice.SerialPortSocketConnected%%o<JDevice>o<TCPSocket>
Delegate.IM_Invoke%%<ComboBox>

OK. The first one happens with a Segmented Control, the second with a ComboBox. But what I don’t understand is that NOWHERE in my code am I doing ANYTHING with a Segmented Control or a ComboBox when running this code. Yes, I have one Segmented Control in my app and a couple of ComboBoxes, but they have nothing to do with this button or with anything in the underlying code I am using. I am stumped as to what is going on. So I figured I would ask for some help here…

Anyone???

Are you able to reproduce this when running in the debugger?

Not yet. I’ve only seen it in a couple instances. It’s on my list to try to reproduce it in the debugger.

I suspect that the error occurs because the combo box or segmented control itself is Nil (like when a window is in the middle of closing) and that somewhere your code is trying to access a property or method of one of those instances. It’s usually simple enough to wrap the call in a try-catch.

But that’s just it - it doesn’t reference either kind of control. I’ve got one segmented control in the project. I’ve been looking for references like that and so far have come up empty. I’ve posted just about 100% of the code that would be running at the time above.

Well, like I said, the debugger would be best in this case ‘cause it’ll show you the exact point of failure.

The hammer way of seeing if there are any references would be to remove the controls and try a run. If there is a reference, it will pop up immediately.

Well, it happened again this morning when the automation action in my app started. And it’s yet a third control being shown. And this control has NOTHING to do with the socket connection. I’ve got a bunch of these connections all going on out to different sockets - in this case it’s looping through something like 20 different devices and making connections to each. I have a number of debug log statements. Here’s the output from DebugView. I dumped the stack trace into DebugView so I could get the whole thing.

00001361	29440.39257813	[8548] Need to send command to JDevice RS232 for ButtonID 0	
00001362	29440.39257813	[8548] Sending RS232 command 0x500x4F0x570x520x310x200x200x200x0D to JDevice Team From RemoteButton 0	
00001363	29440.39257813	[8548] Attempting to connect to the RS232 Port of Device Team	
00001364	29440.39257813	[8548] Not Connected to JAPDevice Serial Port for JDevice Team Connecting To it Now.	
00001365	29440.40039063	[8548] In SerialPortSocketConnected Event for JDevice Team	
00001366	29440.64453125	[8548] Error Stack is:  	
00001367	29440.64453125	[8548] RuntimeRaiseException 	
00001368	29440.64453125	[8548] RaiseNilObjectException 	
00001369	29440.64453125	[8548] Delegate.IM_Invoke%%o<ScreensCC.ScreensCC> 	
00001370	29440.64453125	[8548] AddHandler.Stub.29%% 	
00001371	29440.64453125	[8548] JDevice.SerialPortSocketConnected%%o<JDevice>o<TCPSocket> 	
00001372	29440.64453125	[8548] Delegate.IM_Invoke%%o<ScreensCC.ScreensCC> 	
00001373	29440.64453125	[8548] AddHandler.Stub.3%% 	
00001374	29440.64453125	[8548] HideMouseCursor 	
00001375	29440.64453125	[8548] HideMouseCursor 	
00001376	29440.64453125	[8548] FigureShapeAddCubic 	
00001377	29440.64453125	[8548] enableMenuItems 	
00001378	29440.64453125	[8548] Application._CallFunctionWithExceptionHandling%%o<Application>p 	
00001379	29440.64453125	[8548] enableMenuItems 	
00001380	29440.64453125	[8548] RuntimeRun 	
00001381	29440.64453125	[8548] REALbasic._RuntimeRun 	
00001382	29440.64453125	[8548] _Main 	
00001383	29440.64453125	[8548] wWinMain 	
00001384	29440.64453125	[8548] __chkstk 	
00001385	29440.64453125	[8548] BaseThreadInitThunk 	
00001386	29440.64453125	[8548] RtlUserThreadStart 

So I get the NOE as soon as I go into the Connected event of the socket. No windows were closing. No one was at the computer doing anything. This is a process that ran at 6 in the morning to turn devices on at the customer site. And the control that is referenced lives only in a window that is not even open at the time! Has it been opened? Yes, likely at some point, but it’s not open now!

So I am not certain that this error is in my code and not in the framework.

I’m definitely going to try running the same setup as my customer in the debugger.

Furthermore, it seems that if I schedule the automation in my app to run about 5 minutes from now, everything executes just fine. It’s when it’s scheduled for 6 or 8 hours from now that the problem happens. I just logged onto my customer’s site and set up the same automation that crashed above to run in about 5 to 7 minutes. I’ll go back in and check what happens. I am sure it will run fine.

So I would think that if I can make it run fine now, it should run fine later. But that’s not what is happening…

You are right, but every time the error happens, it’s a different control.

So here’s the code from the connected handler for my TCPSocket:

Public Sub SerialPortSocketConnected(t as TCPSocket)
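  If t = Nil Then 
    // t is the TCPSocket passed to the handler, not a timer
    System.DebugLog "Socket is Nil in SerialPortSocketConnected Event for JDevice "+me.MyName
    Break
  End If
  
  If me = Nil Then 
    // Don't dereference me here; that would itself raise a NOE
    System.DebugLog "JDevice is Nil"
    Break
  End If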
  
  System.DebugLog "In SerialPortSocketConnected Event for JDevice "+me.MyName
  RaiseEvent Connected6752
End Sub

So the TCP socket is not Nil and the device object where the code is running didn’t go Nil, as neither of those messages got logged. So the problem is happening when I go to RaiseEvent Connected6752.

Now, if the object that contains the handler for the Connected6752 event has gone Nil, I understand that a NOE could be raised, but that object is not what is being shown in the stack trace. Perhaps that object is going Nil and the framework doesn’t know where to trace it to? But that object shouldn’t be going Nil as far as I can tell. And if it were, I’d expect it to go Nil when the code is scheduled to run in five minutes just as much as in five hours…

So I just checked my customer’s site after scheduling the code to run after a five minute delay. It ran just fine. No errors. I’ve now got schedules to run in one hour, then two hours after that, then three hours after that, then four hours after that. So I’m attempting longer intervals throughout the day to see when it goes awry.

I’m looking at my code some more. Based on the logging and the stack trace above, my code is never getting into the event: Connected6752. That event is handled by the SendDevRS232 method in my pushbutton object. Now I have that handler added using WeakRefs. So if the button object goes NIL for some reason, then as it’s a WeakRef, the handler should go away as well - right? Would using a strong reference (as opposed to a WeakRef) prevent the button from going NIL? I’d expect that if I used strong refs that if the button would go NIL that I’d get the NOE not the other way around…

So I am still stumped… I’ve got to try to duplicate this in the debugger for sure…

So an update - when the code ran an hour later (at 9:22 AM this morning), the NOE occurred. So there is no NOE with a five minute delay, but there is one after an hour. I am in the process of running this in the debugger on my Windows laptop. Unfortunately, I can’t debug in 64 bit, but that’s probably not the issue…

Jon, are you using Threads at all?

I’m not using threads. No need as everything happens pretty quickly.

But I just ran the app in the debugger on my Windows laptop and I got the NOE the first time I tried it. The error stack didn’t show any other controls but where the NOE is happening is when I call

RaiseEvent Connected6752

Now, is it possible that the object that was set for the handler of that event has gone NIL? I guess it is and maybe that’s what is happening. But I specifically used WeakRefs when setting up the handler. So if the object goes away the WeakRef should go away and therefore it should act as an unimplemented event should it not? While that’s not the behavior I want, it would be a silent failure instead of an exception.

No, it doesn’t. What you have is a handler that is now nil but not removed.

Thank you for confirming that. I was beginning to come to this conclusion myself: the handler is still registered, and the WeakRef just lets the object go out of scope.

So that is what is causing the NOE. I realized after thinking about it that my buttons are not stored in a dictionary - just the information needed to create them is stored in the dictionary. I create them on the fly and then “push” them virtually (sometimes they do map to actual pushbuttons, though, which is why I use a button object). After adding the handler and pushing a button, the call is made to connect to the socket if it is not connected. The button then completes its code and is marked to go out of scope. If the socket connects quickly enough, the button hasn’t been cleaned up yet by Xojo, so it still exists and it works. But if the socket connection takes longer, the button goes out of scope and I get the NOE.

So maybe what I want to do instead is to use a strong ref instead of a WeakRef. Use AddressOf instead of WeakAddressOf. Then that should keep the reference around to the button. Then once I remove the handler after my code is complete, the button can go out of scope.

Is that the right way to do it or should I actually pass the button to my device that has the socket as a property or parameter?

Yes, if you use AddressOf then it will keep the item around until you remove the handler. But to remove the handler you need a reference to the object that the handler was for.
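
A minimal sketch of that approach, assuming the handler is still added in the button’s Action code and removed at the end of SendDevRS232, as in the code posted earlier. The only change is AddressOf in place of WeakAddressOf, used in the same form for both AddHandler and RemoveHandler:

```xojo
// In the button's Action code. j is the JDevice that owns the socket.
// A strong delegate keeps this button alive until the handler is removed.
AddHandler j.Connected6752, AddressOf SendDevRS232

// At the end of SendDevRS232, once the commands have been queued.
// Removing the handler drops the device's reference to the button,
// so the button can then go out of scope normally.
RemoveHandler j.Connected6752, AddressOf SendDevRS232
```

The trade-off is that the button now lives for as long as the handler stays registered, so every code path that can end the attempt (including a failed connection) has to reach the RemoveHandler call.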

Of course. And I also realized that I may have cases where my socket does not connect for whatever reason. If that is the case, then the handler will stick around forever too. So I’m adding a timeout timer to the socket. When it fires, the device will raise a different event to the button object, which will then remove the handlers and clean up. I think I have it figured out. Thank you!
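
One possible shape for that timeout, as a rough sketch only. It uses Xojo.Core.Timer.CallLater (already used above for the command delays) rather than a separate Timer object. The method name ConnectTimedOut, the event name ConnectFailed6752, and the 10 second value are all hypothetical and not part of the posted project:

```xojo
// In JDevice.ConnectToSerialPort, right after SerialPortSocket.Connect:
// schedule a timeout check. 10000 ms is an assumed, arbitrary value.
Xojo.Core.Timer.CallLater(10000, AddressOf ConnectTimedOut)

// Hypothetical JDevice method that runs if the timeout elapses.
Private Sub ConnectTimedOut()
  If Not SerialPortSocket.IsConnected Then
    SerialPortSocket.Close
    // Hypothetical event: the button handles it by removing
    // its Connected6752 handler so nothing is left registered.
    RaiseEvent ConnectFailed6752
  End If
End Sub

// In JDevice.SerialPortSocketConnected, before RaiseEvent Connected6752:
// the connection succeeded, so cancel the pending timeout call.
Xojo.Core.Timer.CancelCall(AddressOf ConnectTimedOut)
```

With this arrangement the button would add handlers for both events up front and remove both in whichever one fires first.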

What was throwing me in all this was the fact that other controls were brought into the stack when the exception happened. Must just be the framework doing a refresh or something with the controls that just so happened to execute while the exception was happening. I have no idea why that was happening…