I need to do web scraping. Is there a library for web scraping in Xojo?
FWIW, scraping is usually very specific to the site & pages you are extracting data from.
Perhaps an explanation of what you are trying to do would help us help you…
I’ve done quite a bit of scraping. My usual approach:
- write the HTML page to disk using HTTPSocket and TextOutputStream
- Convert HTML to text with ‘textutil’ (OSX only)
- Write specific handlers for the text you are looking for (e.g. looking for specific tags or markers)
- Validate the found text (make sure it is the text you want)
- Write the found text fragments to a table
In this fashion I managed to page through 40,000 bulletin board pages, which took about half a day on a fast Mac.
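For anyone who wants to see the shape of those steps, here is a rough sketch. It is written in Python rather than Xojo, and it swaps in Python's standard library for HTTPSocket and ‘textutil’, so treat it as an outline to port, not as the code I actually ran. The function names, the "Author:" style marker, the 200-character validation limit and the table layout are all made up for illustration.

```python
import sqlite3
import urllib.request
from html.parser import HTMLParser


def fetch_html(url):
    """Step 1: download a page (assumes the page is UTF-8)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Step 2: reduce HTML to plain text, one text run per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def extract_after_marker(text, marker):
    """Step 3: return the remainder of every line that starts with marker."""
    found = []
    for line in text.splitlines():
        if line.startswith(marker):
            found.append(line[len(marker):].strip())
    return found


def is_valid(fragment):
    """Step 4: hypothetical sanity check (non-empty, not absurdly long)."""
    return 0 < len(fragment) <= 200


def store(conn, url, fragments):
    """Step 5: write the fragments to a table (hypothetical schema)."""
    conn.execute("CREATE TABLE IF NOT EXISTS fragments (url TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO fragments VALUES (?, ?)",
        [(url, fragment) for fragment in fragments],
    )
    conn.commit()
```

Usage would be something like `text = html_to_text(fetch_html(url))`, then filtering `extract_after_marker(text, "Author:")` through `is_valid` before calling `store`. The marker and validation logic are the parts you end up rewriting for every site.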
If your needs are simple it’s quite easy to do. If RegEx will work and you only need certain pages then you can just keep a list of RegExes and URLs and retrieve and search in a simple loop. I’ve done this before for basic notifications.
If you want to be the next Google, then a lot more work will be required.
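For the simple case, the list-of-RegExes-and-URLs loop might look roughly like this. It is a Python sketch, not Xojo, and the names (`WATCHES`, `check_all`, the example URL and pattern) are hypothetical. The fetch function is passed in as a parameter so the loop can be tried without touching the network.

```python
import re
import urllib.request

# Hypothetical watch list: (URL, regular expression) pairs.
WATCHES = [
    ("https://example.com/news", r"release\s+\d+\.\d+"),
]


def fetch(url):
    """Download a page as text (assumes UTF-8)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def check_all(watches, fetch_page=fetch):
    """Fetch each URL and collect every case-insensitive match of its pattern."""
    hits = []
    for url, pattern in watches:
        try:
            page = fetch_page(url)
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            print(f"skipping {url}: {exc}")
            continue
        for match in re.finditer(pattern, page, re.IGNORECASE):
            hits.append((url, match.group(0)))
    return hits
```

Calling `check_all(WATCHES)` on a timer and sending a notification whenever the returned list is non-empty covers the basic notification case.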