Stack Overflow Error in Framework

Hey all,

I’m getting reports from a customer where my app is generating a stack overflow error and it looks like it’s all Xojo framework routines. Here’s a short bit of the stack:

DispatchMessageW
FigureShapeAddCubic
CustomControlCreatePane
ApplicationSupportsHiDPI
GetWindowRgn
DispatchMessageW
FigureShapeAddCubic
CustomControlCreatePane
ApplicationSupportsHiDPI
GetWindowRgn
DispatchMessageW
FigureShapeAddCubic
CustomControlCreatePane
ApplicationSupportsHiDPI
GetWindowRgn
DispatchMessageW
FigureShapeAddCubic
CustomControlCreatePane
ApplicationSupportsHiDPI
GetWindowRgn

So this repeats over and over and over again until the app crashes. I don’t know what’s driving it either. My app is grabbing still images from some video devices and then displaying those images in canvases. It’s doing this maybe 10 or 12 times a second - as fast as possible.
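To make the repetition in a trace like the one above explicit, here is a minimal sketch in Python (not Xojo) that finds the shortest block of frames which, repeated, reproduces the whole list. The helper name `shortest_cycle` is hypothetical; the frame names are copied from the trace posted above.

```python
def shortest_cycle(frames):
    """Return the shortest prefix that, repeated, reproduces the whole list.

    A trailing partial repetition is allowed, since a captured trace is
    usually cut off at an arbitrary point.
    """
    n = len(frames)
    for period in range(1, n + 1):
        if all(frames[i] == frames[i % period] for i in range(n)):
            return frames[:period]
    return frames  # only reached for an empty list


# The five frames from the posted trace, repeated four times as in the post.
trace = [
    "DispatchMessageW",
    "FigureShapeAddCubic",
    "CustomControlCreatePane",
    "ApplicationSupportsHiDPI",
    "GetWindowRgn",
] * 4

cycle = shortest_cycle(trace)
print(len(cycle))  # 5
print(cycle)
```

A short cycle that repeats until the stack is exhausted points at unbounded recursion rather than at one unusually deep call chain.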

Can anyone (particularly anyone at Xojo) give me a hint as to what might be causing this? Is it an out-of-memory error, a HiDPI error, or what? What’s odd is that I don’t support HiDPI, yet one of the methods being called is “ApplicationSupportsHiDPI”…

Thanks,

Jon

Are you calling Refresh from within a Paint event by chance?

Not that I can tell…

Any App.DoEvents ?

Nope.

I do a lot of calling stuff by timers so everything should be running fairly independently.

You may want to use the profiler. It may give an additional clue.

Agreed. I’m trying to duplicate the error at my end. Of course, I could give the user a build made with the profiler enabled. That’s a very good idea…

Search for Refresh in your code and replace it with Invalidate if you can. The difference is that Refresh redraws the control immediately, whereas Invalidate tells the system to redraw it during idle time. As a result, you can call Invalidate multiple times without an additional performance hit.
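The difference can be illustrated with a toy model. This is a sketch in Python, not Xojo, and `ToyCanvas` is a hypothetical class that only mimics the behaviour described above (immediate redraw versus a deferred, coalesced redraw). It is not how the framework is actually implemented.

```python
class ToyCanvas:
    """Toy model of a control; an assumption for illustration only."""

    def __init__(self):
        self.dirty = False
        self.paint_count = 0

    def paint(self):
        self.paint_count += 1
        self.dirty = False

    def refresh(self):
        # Immediate redraw: every call paints right now.
        self.paint()

    def invalidate(self):
        # Deferred redraw: only mark the control as needing a paint.
        self.dirty = True

    def run_event_loop_once(self):
        # One pass of the event loop paints at most once.
        if self.dirty:
            self.paint()


a = ToyCanvas()
for _ in range(10):
    a.refresh()

b = ToyCanvas()
for _ in range(10):
    b.invalidate()
b.run_event_loop_once()

print(a.paint_count, b.paint_count)  # 10 1
```

Ten immediate refreshes cost ten paints, while ten invalidations before the next event-loop pass collapse into one.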

Not saying this is the problem, per se, but good practice anyway.

I could but I’m trying to draw as fast as I can. But maybe I’m calling too fast for refresh?

You might try setting a flag when you refresh and clearing it in the Paint event, so you only have one refresh going on at a time.
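A minimal sketch of that flag idea, written in Python rather than Xojo. `GuardedCanvas`, `request_redraw`, and `on_paint` are hypothetical names. Clearing the flag at the end of the paint (rather than at the start) is one possible choice, made here so that a request arriving while painting is still in progress gets rejected.

```python
class GuardedCanvas:
    """Hypothetical wrapper that allows only one outstanding redraw."""

    def __init__(self):
        self.redraw_pending = False
        self.paints = 0

    def request_redraw(self):
        # Ignore the request if a redraw is already queued or running.
        if self.redraw_pending:
            return False
        self.redraw_pending = True
        return True

    def on_paint(self, draw):
        try:
            draw()
            self.paints += 1
        finally:
            # Clear the flag only once painting has finished.
            self.redraw_pending = False


canvas = GuardedCanvas()

# Three frames arrive before the paint runs: only the first request is accepted.
results = [canvas.request_redraw() for _ in range(3)]

# A request made from inside the paint itself is rejected.
nested = []
canvas.on_paint(lambda: nested.append(canvas.request_redraw()))

# After the paint completes, requests are accepted again.
after = canvas.request_redraw()

print(results, nested, after)  # [True, False, False] [False] True
```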

Good idea, Greg.

I was able to let my app run for about 20 hours and ended up getting the crash myself. Now to let it run in the debugger until it happens. Then I can get a better sense of things as the stack trace isn’t telling me a lot…

You should definitely use a flag as Greg suggested. There is a good chance your Refresh hits the canvas while it is already painting, and that collision can have this kind of consequence.

I also strongly believe using refresh does not speed things up. You may want to simply do a global replace and see how the program performs with Invalidate.

So, I had my customer try out a build using Invalidate instead of Refresh. Same problem. He still gets the stack overflow. The timing varies: he got it after maybe 30 minutes, while my system has been running the same build for 12 hours with no issue. So I’m not sure what is causing this.

Hey guys, an update on this.

1.) I’ve put code in place to prevent multiple paint events being called at the same time. No difference. The app still crashes with a stack overflow error after some period of time running. The stack overflow is in the framework, and in the debugger I can’t even see what it is. As soon as I try to look at the exception, Windows says there’s a problem with my app and has me close it.

The exception always happens while the paint event is executing. Every time. Generally it happens when calling some other function in the paint event code. I have one function that creates a graphical overlay on the canvas if the mouse is inside the canvas. That seemed to be the place where the exception happened, so I commented out the call to that function to see if it was causing the issue. No: now the exception happened when calling the getter of a computed property.

2.) I’ve been using a lot of timers. When I get a successful image pulled from my device(s), I process the bitmap into a Picture object, then use a timer to call Invalidate on the canvas. At first I was using Xojo.Core.Timer.CallLater, and I could get the exception to happen after perhaps 12 to 24 hours of run time. Since I’ve had other crashes internal to the Xojo.Core.Timer object, I decided to use the SingleActionTimer that Karen and another user created. The app actually ran longer using that timer - roughly 36 hours. Still, the crash happened.

3.) I’ve now gone to eliminating the timer and calling the method that calls invalidate directly. I started testing that this morning. We’ll see how long it goes. Maybe there’s something going on with all these timers sitting out there and firing at different times that is causing it. But I know for a fact, that I’m not calling invalidate while the paint event code is executing.

I have a feeling this is indeed something in the bowels of the framework as nothing in my code should be causing this. However, getting an example app put together so Xojo can see it in real time might not be so easy to do… :frowning:

Jon, if you want a second pair of eyes to look over your code, I’m happy to do it. If you’re not comfortable sending it to me, or that’s not practical, we could set up a remote session.

Thanks. I really don’t have a problem with it. Let me see if I can distill down the key parts of what I’m doing and show you that. I know one time I had another question about what I was doing and I explained it all and Greg said that it was exactly correct. But let me see what happens here with my current experiment of not using the timers and maybe if it still does, I can send you some code…

It would be better if I looked at the original. The problem with distilling it down in a case like this is that you might “distill” away the issue, if it’s in your code to begin with.

Good point. I thought of that. There’s a bit that is unused in one of my main objects because this app doesn’t use quite a few of its functions. And there’s a bunch of stuff I’ve created, started to use and then abandoned, that I’ve not gone back and cleaned up. Still perhaps you can figure your way through it. Not the cleanest project. But I think the overall way I am doing things isn’t too crazy. And I could walk you through what happens. Let me consider this…

According to the LR, the stack stores all the method names from the entry point until the error.

I wonder if it is not simply overflowing the stack because of the sheer number of methods in there. 12 times a second over 36 hours means a lot of method names. From what you posted first, I counted approximately 240 characters.

That comes to 373,248,000 characters in 36 hours. I read that the stack in Unix is 256K. That is certainly over that.
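A quick check of that multiplication, as a Python sketch. The inputs (12 paints per second, 36 hours, roughly 240 characters per pass) are the figures quoted above. It only reproduces the arithmetic; it assumes, as this post does, that the names accumulate, and says nothing about whether they really do.

```python
paints_per_second = 12   # rate quoted earlier in the thread
hours = 36               # run time before the crash
chars_per_pass = 240     # approximate length of the posted trace excerpt

passes = paints_per_second * 3600 * hours
total_chars = passes * chars_per_pass

print(passes)       # 1555200
print(total_chars)  # 373248000

# How many 256 KiB stacks that would fill, under the same assumption.
stacks_filled = total_chars // (256 * 1024)
print(stacks_filled)
```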

Elsewhere I read that the default is 512 K for threads. But I could not get stacksize for the main thread.

There may be a solution. I read in the LR that function names are only stacked if IncludeFunctionNames is used in App (Shared build settings). You may want to try switching it off and see if it alleviates the stack overflow.

BTW what you describe seems identical to Xojo Programming Forum

Michel,

It’s a good thought. However…

How long does Xojo store the stack? Forever and always? If that’s the case then no program could be left running indefinitely since it would always fill up.

My thought was that the stack is the current list of instructions being called. Once that is done, the stack should reset to zero. As I’ve been using timers, each one should really be a very small set of executed instructions.

I don’t think my issue is the same as the one in the Forum post as that issue had to do with size allocation of the stack within threads. I’m not using any threads (and I don’t think I had this problem when I did).

But as a reply in the forum post points out, stack overflow exceptions almost always happen when the same loop of code is called over and over and over again. Look at the example I posted above. That is exactly what is happening. The methods in the framework are recursively calling each other over and over again until it crashes. That’s why I don’t think this is caused by calling multiple paint events. In fact the full stack capture in the error is always longer than what I posted and starts at some unrelated point in my app. It’s like the framework suddenly breaks down. In debugging it’s always in the paint event. But none of my calls to any methods are repeated. It’s only the framework methods that are calling each other over and over and over.
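The distinction between "many calls over time" and "calls that never return" can be shown with a small Python sketch (Python, not Xojo; `depth`, `paint_once`, `ping`, and `pong` are hypothetical names). It relies on CPython's `sys._getframe` to count frames. Note that Python raises a catchable `RecursionError` at its recursion limit, whereas a native stack overflow like the one in this thread typically takes the process down.

```python
import sys


def depth():
    """Count the frames currently on the Python call stack (CPython only)."""
    n = 0
    frame = sys._getframe()
    while frame is not None:
        n += 1
        frame = frame.f_back
    return n


def paint_once():
    # Stand-in for one short, self-contained paint call.
    return depth()


# Sequential calls: each one returns, so the stack depth never grows.
depths = {paint_once() for _ in range(10_000)}


# Mutual recursion: neither function ever returns, so the stack only grows.
def ping(n):
    return pong(n + 1)


def pong(n):
    return ping(n + 1)


try:
    ping(0)
    overflowed = False
except RecursionError:
    overflowed = True

print(len(depths), overflowed)  # 1 True
```

Ten thousand sequential calls all run at the same depth, while two functions calling each other exhaust the stack almost immediately, which matches the repeating pattern in the posted trace.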