How to parse HTML code in Xojo

Hi all!
I grab the source code of a website using JavaScript with an HTMLViewer, and in that code I need to get the “onclick” attribute of a tag, like this: ">

The issue is that this value will obviously always be different, because there are a lot of links in the source code.
I thought of InStr, but I think InStr only works for a fixed text, right?

Any ideas?

Pseudo code:

Dim source As String = [YOUR SOURCE]
// Take the text after the first onclick=" and cut it at the closing ;"
Dim s As String = NthField(NthField(source, "onclick=""", 2), ";""", 1)

You do know that you can easily get the HTML source of a page using a socket instead of getting it from an HTMLViewer? ;)

Really? Is that possible? Wow, how do you get the source with a socket?

Thanks

And how can I loop over every occurrence of the same tag in order to build an array?

  Dim sock as New HTTPSocket
  Dim s as String
  
  s = sock.Get("http://www.xojo.com/", 60)
  MsgBox s

This is also pseudo code, but it might work.
Inspect the ‘resultArr’ array when the debugger stops at the ‘Break’.
source = your source text

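  Dim count As Integer
  Dim resultArr() As String

  // Count with the same delimiter that NthField splits on below
  count = CountFields(source, "onclick=""")

  // Field 1 is the text before the first onclick, so start at field 2
  For i As Integer = 1 To count - 1
    // Keep everything up to the closing ;" of the attribute value
    resultArr.Append NthField(NthField(source, "onclick=""", i + 1), ";""", 1)
  Next

  Break // pause in the debugger to inspect resultArr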

Thanks man!!! you are awesome!!

This can probably be done more efficiently with regex, but I’m no expert on that beast :)

Now, if the word “Return” is present and I want to replace it with “Hooray”, I do this:

resultArr.Append Replace(NthField(NthField(txtsource.text,filtro, i+1), ";""", 1),"return","HOORAY")
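
One thing to watch for with the line above: as far as I know, Replace only swaps the first occurrence in the string, while ReplaceAll swaps every one (both ignore case). A minimal sketch, with a made-up onclick value that is not from the original page:

```xojo
// Hypothetical onclick value, just for illustration
Dim value As String = "return check(); return false"

// Replace touches only the first occurrence
Dim firstOnly As String = Replace(value, "return", "HOORAY")
// firstOnly is "HOORAY check(); return false"

// ReplaceAll touches every occurrence
Dim everyOne As String = ReplaceAll(value, "return", "HOORAY")
// everyOne is "HOORAY check(); HOORAY false"
```

If an onclick value can contain the word more than once, ReplaceAll is probably the one you want.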

That presupposes that the HTML one is dealing with has come from a web site. Suppose it comes from an email? Is there a method I can use to get the serialised HTML back from an HTMLViewer? At least that would give HTML that WebKit had cleaned up and corrected, thus making it more reliable to parse.

Parsing HTML is much easier with the Tidy plugin from the MBS plugins. With it you can clean up the HTML in one line of code.

In regex it would be something like this:

\<[a-zA-Z \=\"0-9\/\-\.]+ onclick=\"([a-zA-Z0-9 \/\'\(\)\+\.\?\=\,\;]+)

The match would be in SubExpressionString(1). Please note that this is just quickly put together and might not be 100% accurate. ;)
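
A looser variant, sketched under the assumption that every onclick value is wrapped in double quotes: capture whatever sits between the quotes and keep calling RegEx.Search until it returns Nil. The sample source string and the variable names are made up for illustration.

```xojo
// Made-up sample source; in practice this would be the page HTML
Dim source As String = "<a href=""#"" onclick=""doThing(1);"">One</a>" + _
  "<a href=""#"" onclick=""doThing(2);"">Two</a>"

Dim re As New RegEx
// Capture everything between the double quotes that follow onclick=
re.SearchPattern = "onclick=""([^""]*)"""

Dim results() As String
Dim match As RegExMatch = re.Search(source)
While match <> Nil
  // Subexpression 1 is the part inside the parentheses of the pattern
  results.Append match.SubExpressionString(1)
  // Calling Search without a target continues after the previous match
  match = re.Search()
Wend
```

With the sample above, results should end up holding the two onclick values. Values wrapped in single quotes would need a different pattern.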

Did you see the HTMLViewer extensions in the MBS Plugin?
They allow you to execute JavaScript on the page.

[quote=231516:@Albin Kiland]Pseudo code:

Dim source as String = [YOUR SOURCE]
s = NthField(NthField(source, "onclick=""", 2), ";""", 1)

You do know that you can easily get the HTML source of a page using a socket instead of getting it from an HTMLViewer? ;)[/quote]
Of course I know it is possible to get the source code with an HTTPSocket.
In my case I need an HTTPSecureSocket, and I have to pass the cookie in the request header.
Doing this, yes, I get the source code.

But when I get the source code with a socket, I only get part of the code.
When I get it through JavaScript, I get all of the source code, including the JavaScript.
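
For anyone following along, the socket approach described here might look roughly like this. It is only a sketch using the classic HTTPSecureSocket API; the URL and the cookie name and value are placeholders, not taken from the original project.

```xojo
Dim sock As New HTTPSecureSocket
sock.Secure = True

// Hypothetical cookie name and value; use whatever the site expects
sock.SetRequestHeader("Cookie", "sessionid=abc123")

// Synchronous request with a 60 second timeout
Dim html As String = sock.Get("https://www.example.com/", 60)
```

A socket returns only what the server sends. Anything the page builds afterwards with JavaScript is not part of that response, which may be why only part of the code shows up this way.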

[quote=231560:@Christian Schmitz]did you see the HTMLViewer extensions in MBS Plugin?
They allow you to execute javascript on the page.[/quote]
Are there examples of how to use them in the “Examples” folder?

Thanks

Sure. Did you check the HTMLViewer examples there?
In the Cocoa, Linux or Win folders?

[quote=231541:@Beatrix Willius]Parsing html is done way easier with the Tidy plugin from the MBS plugins. With this you can clean up the html in one line of code.[/quote]
In which folder are the Tidy plugin examples?
Thanks

tidy is a nice little command line utility on OS X

There is an example on my website: http://www.mothsoftware.com/content/xojo/ .

Maybe look here to see the Tidy example projects from MBS online:

https://www.monkeybreadsoftware.net/plugins-mbstidyplugin.shtml