Identifying bot traffic

How can I identify and reject bot traffic on a Xojo Webapp?

I have a webapp which tracks downloads of a Xojo desktop app, and I just received a warning from the geolocation service about exceeding my quota. Today the downloads suddenly exploded, and these are certainly not real users.

If this is one of your Lifeboat configured servers you can find the nginx access log at

/var/log/nginx/{yourdomain}

Lifeboat only shows you the error.log in the app; the access log is where the successful requests are recorded.
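To get a feel for who is actually hitting the server, the access log can be summarized by User-Agent. A quick sketch, assuming nginx's default combined log format; the sample lines and the /tmp path below are stand-ins for your real log file:

```shell
# Stand-in for /var/log/nginx/{yourdomain} -- substitute your actual log path.
cat > /tmp/sample_access.log <<'EOF'
1.2.3.4 - - [01/Jan/2024:00:00:01 +0000] "GET /download HTTP/1.1" 200 512 "-" "SomeBot/1.0"
5.6.7.8 - - [01/Jan/2024:00:00:02 +0000] "GET /download HTTP/1.1" 200 512 "-" "Mozilla/5.0"
9.9.9.9 - - [01/Jan/2024:00:00:03 +0000] "GET /download HTTP/1.1" 200 512 "-" "SomeBot/1.0"
EOF

# In the combined format the User-Agent is the third quoted field,
# i.e. field 6 when splitting on double quotes. Count and rank them:
awk -F'"' '{print $6}' /tmp/sample_access.log | sort | uniq -c | sort -rn
```

Anything with an absurd request count, or an obviously scripted User-Agent, is a candidate for blocking.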

I am interested in this topic. If there is something I can add to Lifeboat to help, I would love to.

Adding a service like Cloudflare in front of your web app can help.

Google’s reCAPTCHA should work fine also. The latest version doesn’t even ask the user to interact with the captcha; it can detect bots automagically.

As an alternative, the download can be behind an email field. You can send the user a unique URL by email, with their download link.
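A minimal sketch of the unique-URL part, assuming the token is stored server-side together with the recipient’s address and invalidated after the download. The example.com URL is a placeholder:

```shell
# Generate a random, hard-to-guess token to embed in the emailed download URL.
# 16 random bytes -> 32 hex characters.
token=$(openssl rand -hex 16)

# Hypothetical URL scheme; the app would look the token up and serve the file once.
echo "https://example.com/download?token=$token"
```

The important property is that the token is unguessable and single-use, so scrapers can’t harvest a static download link.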


At the server level, a way to visualize those “successful requests”, organized by geographical origin, plus an easy-to-use firewall setting to block traffic from any foreign country or from specific countries?

That sounds handy, but also like a huge task :sweat_smile:

I’m interested specifically in ways to detect and/or block these requests, not so much the ability to parse and understand them as a human.

fail2ban is installed by Lifeboat, and the default config works great for blocking malicious SSH attacks (I checked my logs this morning to confirm!). I have investigated a little how to integrate it for malicious HTTP requests, but my research usually leads me to premium nginx features.
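One thing that is not premium: nginx’s built-in rate limiting (`limit_req`) ships with the free open-source build and can throttle abusive clients before they ever reach the web app. A minimal sketch, assuming a hypothetical server block that proxies to the Xojo app:

```nginx
# In the http {} context: track clients by IP, allow 5 requests/second each.
limit_req_zone $binary_remote_addr zone=perip:10m rate=5r/s;

server {
    # ... existing listen / server_name / ssl settings ...

    location / {
        # Allow short bursts, reject the excess with 429 Too Many Requests.
        limit_req zone=perip burst=10 nodelay;
        limit_req_status 429;
        # ... existing proxy_pass to the Xojo web app ...
    }
}
```

fail2ban could then be pointed at the log to ban IPs that keep tripping the limit, instead of trying to match individual malicious requests.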

In your HandleURL event handler, you could add something like this:

If Request.Header("User-Agent").Lowercase.Contains("bot") Then
  ' Lowercase makes the match case-insensitive ("Bot", "BOT", ...)
  Response.Status = 401
  Return True
End If

This is a very basic way to block bots, but it does seem to help. It only catches bots that actually identify themselves as bots, meaning they have the keyword “bot” somewhere in their User-Agent header.
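The same check can be done at the nginx level, so matching requests never reach the Xojo app at all. A sketch against a hypothetical server block; the `map` directive must live in the `http {}` context:

```nginx
# In the http {} context: flag any User-Agent containing "bot",
# case-insensitively (~* is a case-insensitive regex match).
map $http_user_agent $is_bot {
    default  0;
    ~*bot    1;
}

server {
    # ... existing listen / server_name / proxy settings ...

    if ($is_bot) {
        return 403;
    }
}
```

Like the Xojo version, this only stops bots that announce themselves; a self-identifying User-Agent is the easiest thing for a scraper to fake.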


Yes, exactly: this is on a Lifeboat server.
Web stats are able to differentiate visitors from spiders, Google bots and such, so I think this is doable: rejecting web requests which are most likely not from real users.

I don’t want to ask for an email address or use hCaptcha and the like if I can find a better way.

But this can wait until tomorrow. Maybe there is a solution that can easily be set up with nginx? I’ll find out and report back.