I have been using XMLDocument for quite a while in RB/Xojo and feel pretty comfortable with it. Recently I encountered a task where I needed to read a very large XML document (well, at least large to me) at 812 MB. I figured XMLDocument was going to choke on that, so I switched to XMLReader. I am getting what I want with smaller sample files, but when I call Parse against the larger one it dies almost instantly with an out-of-memory error. The individual records themselves are pretty small, circa 16 elements each, but I guess that does not (or should not) matter. I was assuming that almost nothing would be retained in memory, and I see non-Xojo Expat users only starting to complain about memory at multi-gigabyte files. I am using Xojo 2014r1.1 on OS X. I can probably hand-chop this file into parts, but I would rather understand the limitation or find a solution, as I will encounter this again. Thanks in advance.
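One thing I have not tried yet is feeding the parser in chunks myself, on the theory that handing Parse an entire FolderItem may pull the whole file into a string before Expat ever sees it. Something like this untested sketch, where MyReader stands in for my XMLReader subclass and the chunk size is arbitrary:
[code]
dim f as FolderItem = GetOpenFolderItem("") // the 812 MB file
dim reader as new MyReader // placeholder for my XMLReader subclass
dim bs as BinaryStream = BinaryStream.Open(f, false)

const kChunkSize = 1048576 // feed Expat 1 MB at a time

while not bs.EOF
  // IsFinal = false tells the parser that more data follows,
  // so only the current chunk has to sit in memory
  reader.Parse(bs.Read(kChunkSize), false)
wend

reader.Parse("", true) // IsFinal = true marks the end of the document
bs.Close
[/code]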
Well, encoding made no difference. For what it is worth, I can’t load this file in several Java-based XML editors without also getting an out-of-memory error. Maybe for my purposes I should just roll my own with TextInputStream.
I’m having exactly the same problem with a 600 MB file. The wrinkle with my issue is that it works fine on Mac OS X but throws an OutOfMemoryException immediately when running on Windows.
I’m loading render-data XML files and I’ve had no problem:
[code]
#if TargetWin32 then
  dim fr as FolderItem = GetFolderItem(Main.Savetxt.Text + "temp.xml", FolderItem.PathTypeNative)
#elseif TargetMacOS
  dim fr as FolderItem = GetFolderItem(Main.Savetxt.Text + "/" + "temp.xml", FolderItem.PathTypeNative)
#endif

dim xDoc as new XmlDocument
try
  xDoc.LoadXml(fr)
catch
  xDoc = nil // an error occurred
end try

dim SelectCamera as String
dim myFocalLength as String
dim Resolution as String
dim names() as String
dim values() as String

if xDoc <> nil then
  dim xq as XmlNodeList
  dim xt as XmlNodeList
  xq = xDoc.DocumentElement.Xql("Object/Object[@Type='Camera']")
  xt = xDoc.DocumentElement.Xql("Object[@Type='Scene']")
  dim i, n as integer
  n = xq.Length - 1
  for i = 0 to n
    // walk the matching Camera nodes here
  next
end if
[/code]
Nige, we are not talking about the XMLDocument/DOM model, but instead the event-driven reader.
And the issue is not whether it works, but whether it works on very large files, which is what it is supposed to be good for (amongst other things).
My “solution” to my original problem was to create my own file reader, which reads blocks of characters until it finds a complete top-level element and then passes each element into a standard XmlDocument, roughly as in the sketch below.
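Untested outline of the idea, where "record" is a placeholder for whatever the repeating element is actually called:
[code]
dim f as FolderItem = GetOpenFolderItem("")
dim tis as TextInputStream = TextInputStream.Open(f)
dim buffer as String

while not tis.EOF
  dim line as String = tis.ReadLine
  if line.InStr("<record") > 0 then buffer = "" // a new record starts
  buffer = buffer + line + EndOfLine
  if line.InStr("</record>") > 0 then
    // one complete record, small enough for the DOM parser
    dim xDoc as new XmlDocument
    try
      xDoc.LoadXml(buffer)
      // ... pull the individual fields out of xDoc here ...
    catch err as XmlException
      // skip a malformed record
    end try
  end if
wend
tis.Close
[/code]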
This: [quote]The strength of the SAX specification is that it can scan and parse gigabytes worth of XML documents without hitting resource limits, because it does not try to create the DOM representation in memory.[/quote] is from an article on DexX about the SAX and DOM parsing approaches.
Is it possible that there is something in your event handlers that is causing the problem, Todd? Have you tried “parsing” the file with nothing in any of the reader’s handlers? If it still fails, it sounds like a feedback-worthy report to me.
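That is, a bare reader with no handlers at all, so any memory growth has to come from the parser itself rather than from your code. A minimal sketch, assuming XMLReader can be instantiated directly (if not, subclass it and leave every event empty):
[code]
dim f as FolderItem = GetOpenFolderItem("") // pick the 812 MB file
dim r as new XMLReader // no subclass, so every event is a no-op
try
  r.Parse(f)
catch err as OutOfMemoryException
  MsgBox "The parser alone runs out of memory"
end try
[/code]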
OK, I’ll give it a shot… As luck would have it, the customer wants the data massaged in a different way, so I will be looking at it again next week. If the problem holds, I will be happy to provide the sample file. It is all Danish hockey and football data, so I don’t think there is a national security issue.