XML File. Parsing paragraphs and texts

Hello,

For example, I have this chunk of text from an XML file:

<text:p text:style-name="P1">Some text</text:p>

I am able to find the paragraph elements with this code:

Var x As New XmlDocument // Load the XML into x before querying it
Var nodes As XmlNodeList
nodes = x.XQL("//text:p") // Find paragraph elements in XML

// Loop through the results and display the style name of each "P1" paragraph
Var node As XmlNode
For i As Integer = 0 To nodes.Length - 1
  node = nodes.Item(i)
  If node.GetAttribute("text:style-name") = "P1" Then
    MessageBox("Paragraph: " + node.GetAttribute("text:style-name"))
  End If
Next

But I don’t yet know how to get the sentence “Some text” from the paragraph “P1”.

Thank you in advance for your help.

I can’t try it without some full XML to test with, but perhaps you want something like this:

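Var myText As String
// FirstChild should be the paragraph's text node; it is Nil for an empty paragraph
If node.FirstChild <> Nil Then myText = node.FirstChild.Value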

When all else fails, step through your code in the debugger and look at the properties.

If you use the XmlReader class with its events, it would simplify your parsing of ODT files. With the StartElement and EndElement events you can filter the <text:p> elements, and the Characters event gives you your sentence.

https://docs.xojo.com/index.php/XMLReader
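
A rough sketch of how those three events could fit together, in case it helps. ParagraphReader, mInParagraph and mBuffer are made-up names, and the sketch assumes that element names arrive with their prefix (text:p) exactly as written in the file. It has not been run against a real ODT file, so treat it as a starting point rather than a finished solution.

```xojo
// Class ParagraphReader (hypothetical name), Super: XmlReader
// Properties (hypothetical names):
//   Private mInParagraph As Boolean
//   Private mBuffer As String
// In the IDE each Sub below is added as an event handler of the subclass.

Sub StartElement(name As String, attributeList As XmlAttributeList)
  // Start collecting text when a "P1" paragraph opens.
  // Assumption: Value returns an empty string when the attribute is missing.
  If name = "text:p" And attributeList.Value("text:style-name") = "P1" Then
    mInParagraph = True
    mBuffer = ""
  End If
End Sub

Sub Characters(s As String)
  // The parser may deliver one run of text in several pieces,
  // so append instead of assigning.
  If mInParagraph Then mBuffer = mBuffer + s
End Sub

Sub EndElement(name As String)
  // The paragraph is complete: show the collected text.
  If mInParagraph And name = "text:p" Then
    MessageBox("Text: " + mBuffer)
    mInParagraph = False
  End If
End Sub
```

To run it, create an instance and hand it the XML, for example with reader.Parse(xmlSource), where xmlSource is a String holding the file contents. Text inside nested elements such as <text:span> is collected too, because the flag stays set until the closing </text:p>.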

Hi Daniel,

You can use a regular expression for that.
Place a TextField and a button on your window and add this code to the action of your button:

Var reg As New RegEx
reg.SearchPattern = "<[^<>]+>"
reg.ReplacementPattern = ""
reg.Options.ReplaceAllMatches = True

Var xmlFile As String = TextField1.Text
Var result As String = reg.Replace(xmlFile)

MessageBox(result)
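
Note that this pattern strips every tag, so the text of all elements ends up run together. If only the “P1” paragraphs are wanted, a capture group is one option. This is a minimal sketch, assuming the paragraph contains no nested tags such as <text:span> and that the attribute is written exactly as in the sample line:

```xojo
Var reg As New RegEx
// Doubled quotes are how a quote character is written inside a Xojo string literal
reg.SearchPattern = "<text:p text:style-name=""P1"">([^<]*)</text:p>"

Var match As RegExMatch = reg.Search(TextField1.Text)
While match <> Nil
  // Subexpression 1 is the text between the opening and closing tag
  MessageBox(match.SubExpressionString(1))
  match = reg.Search() // continue after the previous match
Wend
```

Character entities such as &amp; stay encoded with this approach, which is one reason an XML parser is usually the safer choice for anything beyond a quick check.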

Hi Martin T,

I did see the XMLReader.xojo_binary_project file in the Sample Projects folder, but it has no sample code for the Characters event, and I am a newbie in that domain.
Could you suggest some generic sample code for this event?

Thank you in advance

Hi @Daniel_BEGUIN,

I’ve created a small sample project. You can download it here. Enjoy!

Forum for Xojo Programming Language and IDE. Copyright © 2021 Xojo, Inc.