Large XML File Issue

I am trying to read in a large XML document generated by another application. The XML file is 2.33GB in size and, perhaps unsurprisingly, causes my application to crash. I have been experimenting with the XMLExample project to find ways of loading the XML file. Smaller XML files generated by the same application load without any issues, and the large files look correctly formatted, so I think the issue is the size. I have tried…

Method 1
XMLDocument.LoadXML with the file (FolderItem) and this raises the exception…
XmlException Code 69, Message msg:unknown encoding ‘’
Smaller XML files generated from the same application work fine with this method.

Method 2
Using a TextInputStream (with the aim of using a String with XMLDocument.LoadXML)
When I call the ReadAll method, the application crashes with…
EXC_BAD_ACCESS (SIGSEGV)
In the stack, the method looks to be - com.xojo.XojoFramework 0x001f15f6 RuntimeTextFromOldString + 326

Method 3
Using a BinaryStream to read the file into a String, but this crashes at about 1GB (it is read in blocks of 25MB) with the same crash details as above.

I am guessing that a String cannot hold this much data. Does anyone have an idea of how I might load XML files this large?

I am running Xojo 2015 r1, on a MacBook Pro Retina with 16GB RAM.

Thanks for your help.

Andy

You may want to load the XML in chunks of less than 1GB with BinaryStream, and put the content in a database. SQLite will probably suffice.

Thanks Michel for your reply. How would you then pass the contents of the database to the XMLDocument class? It appears that a String cannot hold this much data to pass via XMLDocument.LoadXML, and I assume loading from the database into a String will have the same problems I have loading from a file. I will give it a try though. Thanks.

In the ancient ages, when strings had only 65536 or even 255 characters, we used windowing to manage longer text files. There was never more in memory than what was needed to work.

I believe you can do the same thing: upon load, you read the XML chunk by chunk, making sure each chunk ends at the end of a complete structure, and place each field in a database with the same kind of structure as the XML.
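The windowing idea above can be sketched roughly as follows. This is a language-neutral illustration in Python, not Xojo code, and it rests on assumptions that are not in the thread: the records are flat, non-nested `<item>` elements without attributes, each with an `<id>` and a `<name>` child, and the tag text never appears inside comments or CDATA. All names (`iter_records`, `load_into_db`, the `items` table) are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical record element; assumed to carry no attributes and not to nest.
START_TAG = b"<item>"
END_TAG = b"</item>"

def iter_records(stream, block_size=1 << 20):
    """Yield one complete record at a time; only a small window stays in memory."""
    buffer = b""
    while True:
        block = stream.read(block_size)
        if not block:
            break
        buffer += block
        while True:
            end = buffer.find(END_TAG)
            if end < 0:
                break  # record still incomplete, read another block
            end += len(END_TAG)
            start = buffer.find(START_TAG, 0, end)
            if start >= 0:
                yield buffer[start:end]
            buffer = buffer[end:]  # drop everything already handled

def load_into_db(stream, db, block_size=1 << 20):
    """Parse each small record on its own and store its fields as one row."""
    db.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER, name TEXT)")
    count = 0
    for raw in iter_records(stream, block_size):
        record = ET.fromstring(raw)  # safe: a single record is small
        db.execute(
            "INSERT INTO items (id, name) VALUES (?, ?)",
            (int(record.findtext("id")), record.findtext("name")),
        )
        count += 1
    db.commit()
    return count
```

The file would be opened in binary mode and passed as `stream`. Because the buffer is cut only after a closing tag, a record that straddles two blocks is simply completed by the next read.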

Upon save, you can probably use the same principle: take a batch of records, build an XML structure from them, and send it to a TextOutputStream one chunk after the other.

The only thing is that the save will not be identical to the original file, since the way Xojo generates XML may differ from the original presentation. But as long as all the fields are there, it should not be an issue.
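A minimal sketch of the save side, again in Python as a language-neutral illustration rather than Xojo. It assumes a hypothetical SQLite table `items` with `id` and `name` columns; the function name `export_items` and the element names are made up for the example. The output is well-formed but, as noted above, not byte-for-byte identical to whatever the original application wrote.

```python
import sqlite3
from xml.sax.saxutils import escape

def export_items(db, out, batch_size=1000):
    """Write the table back out as XML, one batch of rows at a time."""
    out.write('<?xml version="1.0" encoding="UTF-8"?>\n<items>\n')
    cursor = db.execute("SELECT id, name FROM items ORDER BY id")
    written = 0
    while True:
        rows = cursor.fetchmany(batch_size)  # only this batch is held in memory
        if not rows:
            break
        chunk = "".join(
            f"  <item><id>{row_id}</id><name>{escape(name)}</name></item>\n"
            for row_id, name in rows
        )
        out.write(chunk)
        written += len(rows)
    out.write("</items>\n")
    return written
```

Here `out` is any text stream opened for writing; `escape` takes care of `&`, `<` and `>` in the field values.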

At least you can try.

A string can hold as much data as you want,
BUT a 2.3 GB string plus your running application probably exceeds the amount of memory the process can access.
Use the SAX style XMLReader - http://documentation.xojo.com/index.php/XMLReader
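To show what SAX-style parsing looks like in practice, here is a rough sketch using Python's standard `xml.sax` module rather than Xojo's XMLReader; the event-driven pattern is the same idea, but the code is only an illustration. The element names (`item` with flat child fields) and the names `ItemHandler` and `parse_items` are assumptions for the example. The parser calls back for each start tag, end tag, and run of text, so only the record currently being read is held in memory.

```python
import xml.sax

class ItemHandler(xml.sax.ContentHandler):
    """Collect the flat child fields of each <item> and hand them to a callback."""

    def __init__(self, on_item):
        super().__init__()
        self.on_item = on_item
        self.current = None  # dict for the <item> being read, if any
        self.field = None    # name of the child element being read, if any
        self.text = []

    def startElement(self, name, attrs):
        if name == "item":
            self.current = {}
        elif self.current is not None:
            self.field = name
            self.text = []

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self.field is not None:
            self.text.append(content)

    def endElement(self, name):
        if name == "item":
            self.on_item(self.current)  # record complete, pass it on
            self.current = None
        elif name == self.field:
            self.current[name] = "".join(self.text)
            self.field = None

def parse_items(source, on_item):
    """Stream through the XML; source is a file name or a binary file object."""
    xml.sax.parse(source, ItemHandler(on_item))
```

The callback could insert each record into a database or update a running total, so the full document never has to fit in a single string.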