Domain name search engine - Sample Project

I’m building a sample project: a web domain search engine. The user will type a domain name into a text field (to check its availability) and then select the extension (.com, .org, .net, etc.) from a popup menu. When the search button is clicked, GoDaddy.com will be queried to check the domain’s availability. While searching, a progress wheel will spin, and the result (yes, it is available, OR no, it’s not available) will appear in a blank label next to the progress wheel.

I’ve looked at the Xojo examples in the installation package, and I’ve also built some examples with HTMLViewer and TCPSocket. I’ve played around with a SOAPMethod example, but to be honest, I’m not sure which approach is best. I’ve also read about cURL as another option. Any suggestions?

Does GoDaddy have an API for this or are you just scraping their site?

Actually they do. Let me play with that some and I’ll come back if I have any questions. Thanks Shao!

You should use HTTPSocket/HTTPSecureSocket requests.
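For example, here is a minimal synchronous sketch; the URL is a placeholder (not GoDaddy’s actual endpoint) and ResultLabel is a hypothetical control:

[code]
// Minimal sketch: synchronous GET over HTTPS.
// The URL is a placeholder, not GoDaddy's real endpoint.
Dim socket As New HTTPSecureSocket
socket.Secure = True // use SSL/TLS

// Get(url, timeout) blocks and returns the response body as a String
Dim response As String = socket.Get("https://api.example.com/check?domain=example.com", 30)

If socket.HTTPStatusCode = 200 Then
  ResultLabel.Text = response // ResultLabel is a hypothetical label on the window
Else
  ResultLabel.Text = "Request failed: " + Str(socket.HTTPStatusCode)
End If
[/code]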

I looked at the API for a bit, but I’m more interested in learning about web scraping, since I want to do that in the future. So I’ll take the web scraping route.

Does anybody know how I could approach this by web scraping?

If you want your program to keep working over time, you should use the API if it exists. Web scraping will stop working each time the webmaster reworks his website, which can happen (very) often, and then you have to modify your program just as often. An API is less subject to change.

Start by getting HTTPSocket working reliably at downloading pages. This also implies learning how to send data to a web site that way, which is not obvious at all. Be ready for a lot of trial and error.
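As a starting point, here is a hedged sketch of sending form data with HTTPSecureSocket; the URL and field names are made up for illustration:

[code]
// Sketch: sending form data to a site, the way a browser form submit does.
// The URL and field names are hypothetical.
Dim form As New Dictionary
form.Value("domain") = "example"
form.Value("tld") = ".com"

Dim socket As New HTTPSecureSocket
socket.Secure = True
socket.SetFormData(form) // encodes the dictionary as a form POST body

// Post(url, timeout) blocks and returns the page the server sends back
Dim html As String = socket.Post("https://www.example.com/domainsearch", 30)
[/code]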

Then you will need to learn how to fetch the significant parts out of the HTML code. Which IMHO requires at least a superficial knowledge of how HTML works.

You are in for a lot of work. I experimented with that a while ago. See https://forum.xojo.com/10694-web-robot

An API is way simpler.

[quote=201164:@Michel Bujardet]Then you will need to learn how to fetch the significant parts out of the HTML code. Which IMHO requires at least a superficial knowledge of how HTML works.

You are in for a lot of work.[/quote]

Regex makes it very easy though
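For instance, assuming the availability message sits in a predictable tag (the pattern below is hypothetical and would need adapting to the real markup):

[code]
// Sketch: extracting one value from downloaded HTML with RegEx.
// Assumes a hypothetical <span class="status">...</span> around the message.
Dim re As New RegEx
re.SearchPattern = "<span class=""status"">([^<]+)</span>"

Dim match As RegExMatch = re.Search(html)
If match <> Nil Then
  ResultLabel.Text = match.SubExpressionString(1) // first captured group
Else
  ResultLabel.Text = "Pattern not found - the markup probably changed"
End If
[/code]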

Indeed. Yet, that is only part of the whole thing.

I suggest you follow the API path and avoid web scraping as much as you can. It’s evil… :wink:

Always use a published API over web scraping.

Don’t web scrape. They will change the page and your app will break. It’s a never-ending cycle and you will pull all your hair out. Trust me on this one.

There are various DNS APIs out there; I would find one and use it. APIs tend not to change very often, and changes are generally announced so you can prepare for them, whereas website changes are made whenever the webmaster/mistress desires, generally with zero announcement beforehand.

To be fair, web scraping can have its uses, for instance when one needs to repatriate a lot of data that exists only as web pages, say for an academic study or survey. Then it can greatly speed up the process over copy and paste. But that is usually a one-off circumstance.

For an app such as the one described by the OP, web scraping is a recipe for pain and suffering. Not to mention possible copyright infringement.

APIs are usually limited in a few ways: request limits, data restrictions, etc., so web scraping works better in most cases.

I don’t think most websites would take legal action if they detected someone scraping. Most probably they would use a firewall to block access to the website, or add a reCAPTCHA.

Possible, until the format of the page breaks the scrape. It is just like driving: keeping to the speed limit will get you there safely, whereas driving too fast will send you into the curb :wink:

They will probably not go to court, but they may request that the app be removed from the MAS for copyright infringement, and most probably they will get satisfaction. Or, as you say, they may just use technical countermeasures.

If an API is available, why chance it?

Once again, I am not against scraping per se. For an app that must be reliable, it is a bad idea IMHO.

Most websites don’t change their format for years. I made some scrapers in 2012 that still work perfectly!

Not everyone releases their app publicly, and most of the time bypassing the technical countermeasures is easy: using deathbycaptcha for captchas, simulating a browser with request headers, or even using a headless browser like PhantomJS to execute JavaScript.
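In Xojo, that header part is just a matter of setting request headers before the call; the header values here are illustrative only:

[code]
// Sketch: presenting the request as if it came from a desktop browser.
// Header values are illustrative only.
Dim socket As New HTTPSecureSocket
socket.Secure = True
socket.SetRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36")
socket.SetRequestHeader("Accept-Language", "en-US,en;q=0.8")
Dim html As String = socket.Get("https://www.example.com/", 30)
[/code]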

Because sometimes the API doesn’t provide the info you need. For example, YouTube has a very well-documented API with which you can search and get info about the resulting videos, but that part of the API doesn’t return the full description, which you may need. So scraping the website is the better choice.

[quote=201822:@Ashot Khachatryan]Most websites don’t change their format for years. I made some scrapers in 2012 that still work perfectly!

Not everyone releases their app publicly, and most of the time bypassing the technical countermeasures is easy: using deathbycaptcha for captchas, simulating a browser with request headers, or even using a headless browser like PhantomJS to execute JavaScript.

Because sometimes the API doesn’t provide the info you need. For example, YouTube has a very well-documented API with which you can search and get info about the resulting videos, but that part of the API doesn’t return the full description, which you may need. So scraping the website is the better choice.[/quote]

Ashot, you missed my point. I do not believe scraping should never be done, or that an API is always best. In fact, each has its own advantages. Neither should be dismissed outright without looking into these differences.

You also have to keep in mind that the OP does not have a lot of experience. A clearly documented API may be easier for her to implement.

Unless someone more acquainted with scraping, like you, creates a sample project for her to learn from…

[quote=201839:@Michel Bujardet]Ashot, you missed my point. I do not believe scraping should never be done, or that an API is always best. In fact, each has its own advantages. Neither should be dismissed outright without looking into these differences.

You also have to keep in mind that the OP does not have a lot of experience. A clearly documented API may be easier for her to implement.

Unless someone more acquainted with scraping, like you, creates a sample project for her to learn from…[/quote]
I did not miss your point; I agree with you. Of course using the API is the best choice in most cases. I was just providing some more info about scraping without an API.

Kayla, just a suggestion: a simple nslookup shell command would be more reliable than any third-party API or web scraping technique. Keep in mind that a domain can still be registered yet remain unavailable as an HTTP website.
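Something like this sketch (control names are hypothetical; the NXDOMAIN check assumes a Unix-style nslookup, Windows words it differently, and a failed lookup only suggests the name may be unregistered):

[code]
// Sketch: availability hint via nslookup in a synchronous Shell.
// DomainField, ExtensionPopup and ResultLabel are hypothetical controls.
Dim sh As New Shell
sh.Execute("nslookup " + DomainField.Text + ExtensionPopup.Text)

// NXDOMAIN means the name does not resolve - a hint, not proof, of availability
If InStr(sh.Result, "NXDOMAIN") > 0 Then
  ResultLabel.Text = "Probably available"
Else
  ResultLabel.Text = "Taken (the name resolves in DNS)"
End If
[/code]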