I have an XMLReader class that reads XLSX worksheet files, and I can use it to read an entire file into my application. I’m trying to implement a preview function using the same code and want it to stop after 10 rows’ worth of data. Spotting the 10th row is easy, but I can’t seem to find a way of telling the XMLReader to stop reading and return what it has done so far.
I can see a Reset method, but calling that from within StartElement causes a hard crash of the application. Does anyone have a solution for putting the brakes on? I suppose raising an exception could be a way out, but it seems a little extreme.
The XMLReader reads a FolderItem, yes. You start it by calling the Parse method, and from then on you are inside the XMLReader’s event handlers, where you have no access to the FolderItem object.
Either way, if you are going to cause an exception, I may as well just raise one myself. For now I’ve subclassed RuntimeException and catch it around the Parse call, which lets me exit when I need to.
I notice that Expat (which XMLReader uses) does have an XML_StopParser function, but it isn’t exposed to Xojo.
IIRC, Expat is fairly old. I know that MBS has an XML reader and I’d imagine it’s a lot newer. But I have no experience with it. And if you’re not an MBS user then sorry for the waste of bandwidth.
Make a subclass of the XMLReader class and override the Parse method like this:
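Private Sub Parse(s As String)
  // mStopped is a Boolean property on the subclass; an event handler sets it to True.
  Dim lines() As String = Split(s, EndOfLine)
  While UBound(lines) > -1 And Not mStopped
    // Feed one line at a time; False tells the parser that more data follows.
    Super.Parse(lines(0), False)
    lines.RemoveAt(0)
  Wend
End Sub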
Then whenever you get to the line where you want to stop, you set mStopped = True.
OK, so take the logic outside of the parser then. You could read the file externally using a TextInputStream and just feed it to the parser in chunks. Or just override the Parse(f As FolderItem) method.
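For anyone wanting to try the chunked approach, here is a minimal sketch of it. PreviewReader is a hypothetical XMLReader subclass and mStopped is assumed to be a public Boolean property that one of its event handlers sets to True; the chunk size is arbitrary. It uses the classic API names seen elsewhere in this thread (in API 2.0, EOF is called EndOfFile).

```xojo
// Hypothetical helper: feeds a file to the parser in fixed-size chunks
// and stops reading as soon as the reader says it has seen enough.
Sub ParseInChunks(reader As PreviewReader, f As FolderItem)
  Const kChunkSize = 65536 // bytes per read; an arbitrary choice

  Dim input As TextInputStream = TextInputStream.Open(f)
  While Not input.EOF And Not reader.mStopped
    // False = not the final piece, so the parser keeps its state between calls
    reader.Parse(input.Read(kChunkSize), False)
  Wend
  input.Close
End Sub
```

Two things to keep in mind with this: events for the rest of the current chunk still arrive after mStopped is set, so the handlers should ignore them, and since the parser is never told that the input is finished, I would not count on EndDocument firing.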
You still might want to try it. I just did it here with a 1MB file and it took 5166 microseconds to read the file and 0.16 seconds to parse the first 10 lines.
Well, my exception method doesn’t seem to work. All that happens is that the exception is raised, but there’s then a massive delay with nothing operational until the file read is complete. Trying your scheme now. One slight wrinkle is that there is no EndOfLine within the file, so I’ll have to split it another way.
I was wondering about that also. That said, I only need to read 10 rows of data, so I can read the file until I hit </ row> for the 10th time. I can then make the file complete by simply adding “</ sheetData></ worksheet>” (spaces added to make it visible in the forum) to the end of it. Everything should then be well-formed.
I’ve just checked with a worksheet with 183000 rows in it and the file size reduces from 390MB to 25KB.
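In case it helps anyone else, the truncation idea can be sketched roughly like this. The function name, the chunk size and the constants are my own choices for illustration, and it assumes every row ends with a literal closing row tag (no self-closing empty rows) and that nothing after sheetData is needed for the preview. It works on bytes (InStrB, LeftB, LenB) so a multi-byte character split across two reads does not upset the search.

```xojo
// Hypothetical helper: returns the start of a worksheet file up to and
// including the maxRows-th closing row tag, with the open elements closed.
Function TruncatedSheetXML(f As FolderItem, maxRows As Integer) As String
  Const kChunkSize = 65536 // bytes per read; an arbitrary choice
  Const kRowEnd = "</row>"
  Const kTail = "</sheetData></worksheet>"

  Dim input As TextInputStream = TextInputStream.Open(f)
  Dim buffer As String
  Dim rowsFound As Integer
  Dim searchFrom As Integer = 1

  While rowsFound < maxRows
    Dim hit As Integer = InStrB(searchFrom, buffer, kRowEnd)
    If hit > 0 Then
      // Count the row and continue searching just past its closing tag
      rowsFound = rowsFound + 1
      searchFrom = hit + LenB(kRowEnd)
    ElseIf input.EOF Then
      Exit // ran out of file before finding enough rows
    Else
      // No complete closing tag in what we have so far, so read some more
      buffer = buffer + input.Read(kChunkSize)
    End If
  Wend
  input.Close

  If rowsFound < maxRows Then
    // Fewer rows than requested: the whole file was read, return it untouched
    Return buffer
  End If
  Return LeftB(buffer, searchFrom - 1) + kTail
End Function
```

The result can then be handed to the existing reader in one go, for example reader.Parse(TruncatedSheetXML(f, 10)), so none of the event handlers need to know about the preview limit.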
Maybe you need to keep track of all open tags in order, removing each one when it closes. Once you reach your desired goal, you could just emit the closing tags for whatever is still open, innermost first. That way, if you receive something different from what you expected, with extra tags, the routine would probably still keep the result readable, though I’m not sure it would be consistent.
No, given that the only thing reading this is my code, I’m safe just adding the close tags. It will work fine. It also prevents any issues with my EndDocument method.