Insert/modify xmlnode

  1. 2 weeks ago

    Hello,
    I'm trying to modify several TMX files.
    TMX files are XML files used for translation interchange.
    Inside the translations there are tags such as <bpt> and <ept>. These tags are not escaped inside the XML.

    Example: I have a node called "seg" that can look like this:
    case1:

    <seg>Hi, I am a text</seg>

    or like that:
    case2:

    <seg>
        <bpt i="1" type="314" x="1" />123<ept i="1" />This is a text
    </seg>

    (or a combination of the two)

    Everything inside <seg> isn't actually considered part of the XML structure, but rather textual content, where < and > are not escaped as they should be. This is common in this kind of file.

    So in case1, setting seg.value = "hello" works, while in case2 I get an XML exception, of course, because there's a structure under it.
    I would also like to set something like seg.value = "<bpt i="1" type="314" x="1" />123<ept i="1" />hello World" and have it updated without < and > being encoded.

    Considering I may have any combination of tags inside my text, is there a way to convert the string representation of an XmlNode into an XmlNode object? Or is there something like an XmlNode.stringcontent = "xxxxwhatever" setter? Basically, what I need is the reverse of XmlNode.ToString.

    Thank you very much to anybody who will read and help!

  2. Edward P

    Nov 29 Pre-Release Testers Tampa, FL, USA

    You are saying that in case 2, the "bpt" and "ept" tags are not part of this XML document's structure, but rather are data for the <seg> tag? In your example, any XML parser is going to treat <bpt /> and <ept /> as sub-nodes of the <seg>.

    If you want the literal <bpt i="1" type="314" x="1" />123<ept i="1" />This is a text to be the text value of <seg> and not get "&gt;" and "&lt;" substituted for the "<>", you will have to make that part a CDATA section, not a regular text node. Something like this...

    Dim xDoc as New XmlDocument
    Dim seg as XmlNode
    seg = xDoc.CreateElement("seg")
    xdoc.AppendChild(seg)
    Dim segData As XmlNode
    segData = xDoc.CreateCDATASection("<bpt i =""1"" type=""314"" x=""1"" />123<ept i=""1"" />This is a text")
    seg.AppendChild(segData)

    The resulting XML from the above is

    <?xml version="1.0" encoding="UTF-8"?><seg><![CDATA[<bpt i ="1" type="314" x="1" />123<ept i="1" />This is a text]]></seg>

    To retrieve it

    Dim s As String = seg.firstchild.value

    And that will return

    <bpt i ="1" type="314" x="1" />123<ept i="1" />This is a text

    That's the only valid way to handle it without escaping the "<>" characters, especially if the resulting documents are going to be parsed against the TMX schema or by a program that is assuming a well-formed TMX document.

  3. Hello and thank you Edward for your reply,
    I know what you say, but the fact is I'm dealing with many many of these files, created by other (industry standard) software. The software expects them to be that way.

    I was hoping to parse the text before updating and then insert it as XML nodes. Something like (I'm inventing a method here ;-) ):

    dim x as xmlnode
    x = xml.parsexmlfragment("<seg><bpt i=""1"" type=""314"" x=""1"" />123<ept i=""1"" />This is a text</seg>")

    and then inserting it the way it is. Do you think something like that might work?

    This is because if I treat everything as XML nodes/subnodes, I would have some nodes with subnodes and text mixed together, and I'm not sure that works in XML.

    The other solution I can think of is to insert everything as <![CDATA[ ... ]]> and then process the XML as a text file, removing the CDATA markers, but some files are over 200MB...

  4. If I write this:

    dim xd1 as XmlDocument
    dim xmldeclaration, mysegment  as string
    
    xmldeclaration = "<?xml version=""1.0"" encoding=""UTF-8""?>"
    mysegment = "<seg><bpt i =""1"" type=""314"" x=""1"" />123<ept i=""1"" />This is a text</seg>"
    
    xd1 = new XmlDocument(xmldeclaration+mysegment)
    
    msgbox(xd1.tostring)

    I get this xml:

    <?xml version="1.0" encoding="UTF-8"?><seg><bpt i ="1" type="314" x="1" />123<ept i="1" />This is a text</seg>

    So what I need now is a method to transfer/copy/clone an XmlNode from one XmlDocument to another.

    Something like

    dim xDoc1,xDoc2 as XmlDocument
    dim xmldeclaration, mysegment  as string
    dim xNode1,xnode2 as xmlnode
    
    xmldeclaration = "<?xml version=""1.0"" encoding=""UTF-8""?>"
    mysegment = "<seg><bpt i =""1"" type=""314"" x=""1"" />123<ept i=""1"" />This is a text</seg>"
    
    xDoc1 = new XmlDocument(xmldeclaration+mysegment)
    xNode1 = xDoc1.xql("//seg").item(0)
    
    xDoc2 = new xmldocument
    xNode2 = xDoc2.CreateElement("seg")
    
    xnode2 = copydatafrom(xnode1, xnode2) //This is the method I need to perform the copy from node of document1 to node of document 2
    xDoc2.AppendChild(xNode2)
    
    
    MsgBox(xdoc2.ToString)

    Any help on how to properly loop over an XmlNode and clone it into another XmlNode in another XmlDocument?
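
    One possible shortcut, sketched from memory of the Xojo XML classes and not tested against real TMX files: XmlDocument.ImportNode(foreignNode, deep) is assumed to return a deep copy of a node that is owned by the importing document, which can then be appended. It stands in for the invented copydatafrom above; whether it is available and behaves this way depends on your Xojo version.

    ```xojo
    // Parse the segment in a scratch document, as in the snippet above.
    Dim xmldeclaration As String = "<?xml version=""1.0"" encoding=""UTF-8""?>"
    Dim mysegment As String = "<seg><bpt i=""1"" type=""314"" x=""1"" />123<ept i=""1"" />This is a text</seg>"
    Dim xDoc1 As New XmlDocument(xmldeclaration + mysegment)

    // Assumption: ImportNode(node, True) copies the node with all of its
    // children and attributes, and the copy belongs to xDoc2.
    Dim xDoc2 As New XmlDocument
    Dim xNode2 As XmlNode = xDoc2.ImportNode(xDoc1.DocumentElement, True)

    // Appended as the root here only to keep the sketch short; in a real
    // TMX file it would replace or be appended under an existing node.
    xDoc2.AppendChild(xNode2)

    MsgBox(xDoc2.ToString)
    ```

    The mixed content (elements and text side by side under <seg>) is kept as ordinary child nodes, so nothing needs to be escaped or wrapped in CDATA.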

  5. OK, I was able to write a function that parses the text and inserts it as XML nodes. I parsed the text into a new, empty XML document, selected the resulting node, and then looped over all its children, recursively checking each child's type and replicating nodes, values and attributes in the main XML document.

    Thanks to this, it is possible to have text inserted as XML nodes inside an XmlDocument object.

    Thank you for your help!
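
    A rough sketch of what such a recursive copy might look like. This is not the function actually used here: CopyChildren is a hypothetical name, only element, text and CDATA children are handled, and the XmlNodeType constants and the GetAttributeNode(index) overload are assumed from the Xojo XML classes.

    ```xojo
    // Hypothetical helper: replicates the children of source (from a scratch
    // document) under target (a node of the main document).
    Sub CopyChildren(source As XmlNode, target As XmlNode)
      Dim doc As XmlDocument = target.OwnerDocument

      For i As Integer = 0 To source.ChildCount - 1
        Dim child As XmlNode = source.Child(i)

        Select Case child.Type
        Case XmlNodeType.ELEMENT_NODE
          // Recreate the element in the main document and copy its attributes.
          Dim newElem As XmlNode = target.AppendChild(doc.CreateElement(child.Name))
          For a As Integer = 0 To child.AttributeCount - 1
            Dim attr As XmlAttribute = child.GetAttributeNode(a)
            newElem.SetAttribute(attr.Name, attr.Value)
          Next
          // Descend into the element's own children.
          CopyChildren(child, newElem)

        Case XmlNodeType.TEXT_NODE
          target.AppendChild(doc.CreateTextNode(child.Value))

        Case XmlNodeType.CDATA_SECTION_NODE
          target.AppendChild(doc.CreateCDATASection(child.Value))
        End Select
      Next
    End Sub
    ```

    It would be called with the parsed <seg> from the scratch document as source and an empty <seg> created in the main document as target.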
