While my app is running in the debugger (have not tried standalone builds yet) it is crashing at seemingly random times with crash logs that look like this:
This sure looks to me like a nasty failed assertion (I’m sure you had a hand in this, @Norman Palardy!) in the XML plugin in Xojo. I’m pretty sure it only seems random because I have a thread running in the background doing some XML transaction stuff with a remote server. The crash happens when the XML processing in the thread pukes.
If I knew where in the XML plugin it is dying, I could post a narrow snippet of my code around where that is called… but this thread is handling the synchronization of data between 2 databases, so there is quite a bit of code in it (7 separate methods in this thread, about a dozen more in a module it uses, plus the run event in the thread itself)… probably too much to effectively just shotgun here (plus, this is a commercial project, and the code is not particularly… public). Perhaps someone at Xojo (Hi, @Norman Palardy!) can look up what is going on at line 995 of /var/lib/buildbot/slave/QuickStableXCode/build/REAL Software Plugins/Xml Plugin/Sablot-1.0.1/src/engine/output.cpp and give me a clue about which XML function leads there. That will at least help me narrow down where to look.
Please file a bug report in Feedback with the crash log attached or post the crash log to something like pastebin. Partial crash reports aren’t very helpful.
I’ll take the blame for a lot of stuff - but not this one.
That’s way down deep in the bowels of some C++ code which I rarely touch (in all my time here I think I’ve fixed a handful of bugs in our C++ code).
Ok, after 2 days of bisecting code I believe I’ve found the exact root cause, and I believe it is a good case for why assertions (especially in production code) are not a great substitute for what should have been handled by error checking. In short, the bug I’ve spent 2 days chasing appears to happen because the XML parser in Xojo simply barfs with a failed assertion if you hand it a string to add to a node and that string consists of bytes that are not string-like. Instead of failing an assertion, it should recover gracefully in any number of ways (throw an exception with useful info about the error, put in a blank string, offer some sort of error condition that can be checked to verify operations, etc.). Simply barfing at runtime on a failed assertion rather than handling this better has cost me 2 days, and a fix won’t happen for at least several months (assuming it gets high enough priority), or several years, or maybe never.
Here is the line of code that is causing my entire app to come crashing down:
So, what is happening here? I’ve got an XMLDocument called rowNode that I’m stuffing things into. In the case of this line, I’m walking through the results of a select from my database and pulling out the string values of the columns in the table and putting them into the node.
The problem occurs when one of the values in the database is basically binary data rather than a recognizable string. rs.field(“foo”).stringValue still returns a string object, but the bytes of that string look like this:
In other words, it’s not an intelligible string at all, and is in fact the result of some other error long ago in this particular customer database that mucked up this specific row in this table. So I need to add some error checking to find and remove data that may have become corrupted in similar fashion before throwing at the XML parser. But here’s my concern: In what other ways will it just throw a failed assertion? What other failure conditions have been pushed out to me rather than handled in the XML framework? There is no way to know, and I only happened to catch this one because I was testing against a specific database that had this specific corrupted row.
In other news, does anyone know of a good way to check to see if a string that comes out of RecordSet.field(“name”).stringValue is in fact a “valid” string?
Now you can use Encoding.Name.IsValidData
All that WILL tell you is “you could take that data and define this data to be this encoding”
It is NOT a “guess what encoding this data might be” kind of method
That data would be valid in any single byte encoding - whether it’s “correct” is an entirely different question
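To make the distinction above concrete, here is a small language-neutral sketch in Python (not Xojo). `is_valid_in` is a hypothetical helper that plays roughly the role IsValidData plays in this discussion, and the sample bytes are made up for illustration; they are not the bytes from the crash.

```python
def is_valid_in(data: bytes, encoding: str) -> bool:
    """Return True if `data` can be decoded under `encoding` without errors.

    This only answers "could these bytes be read as this encoding?".
    It says nothing about whether that is the encoding the writer intended.
    """
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Made-up sample: a few bytes that look like binary junk.
garbage = bytes([0x00, 0xFF, 0xFE, 0x8A, 0x41])

utf8_ok = is_valid_in(garbage, "utf-8")      # 0xFF can never occur in UTF-8
latin1_ok = is_valid_in(garbage, "latin-1")  # every byte maps to some code point
```

The same junk is rejected as UTF-8 but accepted as Latin-1, which is why a "valid" result from a single-byte encoding tells you very little about whether the data is sane.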
And I’m still a proponent of “die early, die often, die hard”
Much of the code in the XML plugin is NOT ours; it comes from an open source project (Sablotron), so the issue could be way down inside that
But I’d bet a sample could be put together now that we know what’s causing this
I thought the default encoding for all strings in Xojo (for several years now) is UTF-8. In the debugger, this string (as returned by the StringValue() method) appears to pop out of the database with UTF-8 as the encoding.
I’ll try the isValidData method.
Exactly. That’s why I kept referring to it as “valid”. Bytes are bytes. Strings are only bytes as interpreted in the context of their encoding.
[quote=78955:@Kimball Larsen]I thought the default encoding for all strings in Xojo (for several years now) is UTF-8. In the debugger, this string (as returned by the StringValue() method) appears to pop out of the database with UTF-8 as the encoding.
[/quote]
It is, but this data is NOT originating from anything IN Xojo or your app
If it comes from a serial port, a TCP socket, a database, a file, etc., it’s “foreign” and probably needs a DefineEncoding
I’d double check your data before shoving it into the XML child, but I’d not be surprised if it has a nil encoding
XML is very finicky about what characters you stick in it. It will definitely barf on binary data in a text node. About the only place you can put binary data is in a CDATA section. Otherwise, XML just loses its mind.
It really is incumbent on the programmer to validate the data before handing it to xml. The kind of code you posted above is very dangerous.
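As a rough illustration of that kind of validation, here is a sketch in Python (not Xojo) that rejects text containing characters outside the XML 1.0 `Char` production before it ever reaches an XML library. `check_xml_text` and `InvalidXmlText` are hypothetical names invented for this example, and the check covers only character legality, not escaping of `<` or `&`.

```python
import re

# Anything NOT in the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


class InvalidXmlText(ValueError):
    """Raised when a string contains a character XML 1.0 does not allow."""


def check_xml_text(value: str) -> str:
    """Return `value` unchanged, or raise InvalidXmlText naming the first bad character."""
    match = _ILLEGAL_XML_CHARS.search(value)
    if match:
        raise InvalidXmlText(
            f"illegal XML character U+{ord(match.group()):04X} at index {match.start()}"
        )
    return value
```

Calling something like this on every database value before adding it to a node turns a hard crash into an error the caller can catch, log, and skip. Data that is genuinely binary is commonly base64-encoded first so that only legal characters reach the document.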
My M_String package has a method that will analyze a string and return the best encoding it can. Another method will strip invalid bytes from a UTF-8 string, if either of those will help you.
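For anyone curious what "strip invalid bytes from a UTF-8 string" can look like in general terms, here is a minimal sketch in Python (not Xojo). It is not a description of how M_String does it; `strip_invalid_utf8` is a hypothetical name and the sample bytes are made up.

```python
def strip_invalid_utf8(data: bytes) -> str:
    """Decode `data` as UTF-8, silently dropping any bytes that are not valid UTF-8."""
    return data.decode("utf-8", errors="ignore")


# 0xC3 0xA9 is a valid two-byte sequence (e with acute accent);
# 0xFF and 0xFE are never valid in UTF-8 and get dropped.
cleaned = strip_invalid_utf8(b"caf\xc3\xa9 \xff\xfeok")

# A NUL byte is perfectly valid UTF-8, so it survives this step.
still_has_nul = strip_invalid_utf8(b"a\x00b")
```

Note that this only fixes encoding problems. Control characters such as NUL pass straight through, so text cleaned this way can still be something an XML parser refuses.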
I’d agree that assertions shouldn’t be used for error checking, but having assertions in production code is a good thing, in my opinion. Assertions check that invariants hold true and if the assertion fails, the program is in an unknown state and it’s better to fail early than silently corrupt the XML document.
This appears to be true and an exception definitely should be raised before the information is passed off to the XML engine.