Entire Web Server Hanging from CGI App

All programs and sites on my Linux web server get hung up from something that is going on in a fairly simple CGI app. What could cause this?
Even a forced infinite loop (which I thought would hang the server) is not hanging the server.
The app uses its own Sqlite database - never had issues before. Nothing is showing up in the error.log.

What could take down the server?

Are you sure the entire server is hung, or just Apache? Assuming this is a VPS, SSL into a terminal and run top to see which process(es) are spiking the CPU.

OK - I knew it would come to this at some point. I need to learn how to do that. What is the best way for me to get up to speed on how to do what you are suggesting?

I have a VPS with GoDaddy, using Plesk as the server management console.
Do I do this through something like PuTTY or do I use Plesk?

If I am reasonably sure that it is my CGI app causing the issue - what should I be looking for?

I have been using it for years. Quite convenient :)

OK - so can you walk me through the high level steps I need to do to look at CPU activities on the server? Please.

Sorry, I mean SSH, not SSL.

I can’t remember whether Plesk has a built in SSH client, and I don’t have any Plesk managed servers any longer (I use Amazon EC2 servers for everything now). I do know Plesk has CPU monitoring, but again, I can’t guide you to it as I don’t have one to look at any longer.

Since you mention PuTTY, I assume your computer is Windows. If it’s Mac or Linux, then you can just use the ssh terminal command directly. You need a login and password that is enabled for remote SSH connections on your server. Then just follow the documentation for PuTTY (should come with it, or get it here).

If you can get connected and run top, look to see if either your app or the Apache process (httpd) is pegging the CPU.

Have you tried building and running one of the sample web apps that comes with Xojo? That can help determine if the problem is really with your app, or with your server.

I have multiple web apps that I have built with Xojo running on this server - and this particular app that I suspect has been operating for 6 months. I do update the app every month or so - so that is why I suspect I (or the framework) am doing something new.

I am trying to get connected with PuTTY right now - will see what I find. Unfortunately, the system isn’t hung at the moment - so I may not learn anything - other than how to look for things when it does break.

OK - got connected. What is the SSH command I should use to tell me what I am looking for?

Just enter in top. You will get a real time display of the system and processes (like Task Manager in Windows). The display automatically sorts the processes by CPU usage, highest at the top. When the problem occurs you want to see if any processes are maxing out. You can kill a process from here by hitting ‘k’ and entering the PID of the process. Use ‘q’ to exit top.

What I mean is just type ‘top’ and hit enter, like any other command prompt function.
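
For a one-off snapshot instead of the live display, something like the following may help. It is only a sketch, and it assumes the procps version of ps that most Linux servers ship with; top_cpu is a made-up helper name, not a standard command.

```shell
# List the busiest processes, highest CPU usage first.
# Assumes procps `ps` (for --sort); top_cpu is a hypothetical helper name.
top_cpu() {
    # $1 = how many processes to show (default 10); +1 keeps the header row
    ps -eo pid,user,pcpu,pmem,comm --sort=-pcpu | head -n "$(( ${1:-10} + 1 ))"
}

top_cpu 5
```

The COMMAND column shows the process name, which is how you can tell the binary app apart from httpd.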

Top command shows my app consuming 100% of CPU - interesting. Now I need to figure out why.

It shows the umbrella name for my app at 100% - not the appname.cgi - not sure what that means.

The binary app is where the work is done. The cgi app is just a communicator between Apache and the binary app.

So Apache is not pegging? It’s strange that just the binary app would cause all other web server functions to stop.

I tried to kill - got this…
Kill PID 17460 with signal [15]:

Then I typed y
It didn’t like that - error message indicating I don’t have permission to kill that app.

I will see what happens after I restart the server - see if that app is still pegging the CPU.

[quote=100358:@Mark Pastor]I tried to kill - got this…
Kill PID 17460 with signal [15]:

Then I typed y
It didn’t like that - error message indicating I don’t have permission to kill that app.

I will see what happens after I restart the server - see if that app is still pegging the CPU.[/quote]
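The ‘15’ in square brackets is the default signal that will be used if you just hit enter. There are different signals to try with kill. ‘15’ is SIGTERM, a polite request to exit that the process can catch or ignore. ‘9’ is SIGKILL, which the process cannot catch or ignore (extreme prejudice).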

You may need to use sudo (‘do’ as the ‘s’uper ‘u’ser) when running top, since a normal user can only signal processes it owns. Enter ‘sudo top’.
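
Outside of top, the same signals can be sent with the kill command. Here is a minimal sketch that uses a throwaway sleep process instead of a real app; term_demo is a hypothetical name used only for this illustration.

```shell
# Send SIGTERM (signal 15) to a throwaway background process and
# report how it ended. term_demo is a hypothetical helper name.
term_demo() {
    sleep 300 &
    pid=$!
    kill -TERM "$pid"            # same as plain `kill "$pid"`
    status=0
    wait "$pid" || status=$?     # 128 + signal number when ended by a signal
    echo "$status"
}

term_demo    # prints 143, i.e. 128 + 15

# For a process owned by another user, the equivalent would be e.g.
#   sudo kill -TERM 17460
# and `kill -KILL` (signal 9) only if SIGTERM is ignored.
```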

OK - great help so far - thanks Jay! After restarting the server - the process stopped gobbling up CPU. Then I invoked the app, and processes were generally in the 3% or less level. Closed the CGI app web page and CPU usage of app went to 0% - but still shows up in memory. I will use these processes to run some experiments. Will come back to this conversation as more info is available and more insights are needed.

Additional data and a question.
The reason my app was pegging the cpu at 100% is because, as I mentioned above, I wrote an infinite loop to try to emulate the server hang I am experiencing. After restarting the server, and running my app - nothing was breaking (no CPU at 100%; no server hangs). So then I experimented by running the infinite loop - and sure enough the CPU was pegged at 100%. But the server wasn’t hanging! So I ran a second session of the infinite loop - and it ran - CPU pegged at 100%. Then I ran my other unrelated CGI web app - runs fine, even though 2 sessions of original app are pegging the CPU.

Conclusion - an app pegging the CPU cycles is probably not what is causing server hang - which I think, Jay, is where your original thoughts were.

And now for the question…
So, if NOT an app pegging cpu cycles, then what would cause a server hang?

[quote=100378:@Mark Pastor]So, if NOT an app pegging cpu cycles, then what would cause a server hang?

[/quote]
First you need to define what a “server hang” is. Is the entire server really hung, or is it just Apache? Or is Apache perhaps waiting for your app to return something? That is not likely, as Apache handles requests concurrently (in multiple processes or threads), so this shouldn’t affect other apps or web sites.

You’ll probably just need to wait until the problem occurs and then investigate with top.
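
Since the problem is intermittent, one option is to record snapshots so there is something to look at afterwards. This is a rough sketch under a few assumptions: a Linux server with /proc and procps ps, and a made-up helper name (log_snapshot) and log path.

```shell
# Append a timestamped snapshot of the top CPU consumers to a log file.
# log_snapshot and the log path are hypothetical names for illustration.
log_snapshot() {
    logfile="$1"
    {
        date '+%Y-%m-%d %H:%M:%S'
        cat /proc/loadavg          # 1, 5 and 15 minute load averages
        ps -eo pid,user,pcpu,pmem,comm --sort=-pcpu | head -n 6
        echo                       # blank line between snapshots
    } >> "$logfile"
}

# Take one snapshot now.
log_snapshot "${TMPDIR:-/tmp}/cpu-watch.log"
```

Run from cron or a simple loop, this would leave a record of what was busy just before a hang, even if you can't get an SSH session at that moment.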

I have seen Xojo apps take 100% CPU time and if several do that, all cores can be busy and not even SSH login will work well.

One thing to try is to make sure all the web apps run at a low priority, so a runaway app cannot starve everything else.
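
At the operating system level, priority can be lowered with nice and renice. The sketch below is only an illustration with a throwaway sleep process standing in for a web app; how a given app's launcher would apply this is not covered here.

```shell
# `nice` with no arguments prints the niceness it is running at, so this
# shows a child being started 5 steps lower in priority than the shell.
base=$(nice)
lowered=$(nice -n 5 nice)
echo "shell niceness: $base, child niceness: $lowered"

# Lower the priority of an already running process. A throwaway `sleep`
# stands in for the web app here.
sleep 30 &
pid=$!
renice -n 19 -p "$pid" > /dev/null     # 19 is the lowest priority
ni=$(ps -o ni= -p "$pid" | tr -d ' ')
echo "niceness of $pid is now $ni"
kill "$pid"
```

A higher niceness value means a lower priority, so other processes such as sshd get CPU time first when the machine is busy.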