Possible Xojo.Core.Dictionary bug

Thomas_Tempelmann · August 29, 2016, 7:46pm

What do you mean? Shall I open another bug report for all to see?

Michel_Bujardet · August 29, 2016, 7:47pm

Do you know how to add to a case? Easy: click on the pencil. You are in the beta group, aren’t you?

What is important is that the bug be reported, so it has a chance to be fixed.

This thread is basically enough for public debate, is it not? There is a lot more here than in a small report anyway.

I love it. Nobody had any courage to file that pesky bug report, but now complaints come that it is not right. Pff.

Eli_Ott · August 30, 2016, 4:30am

[quote=284277:@Kem Tekinay]Eli, I must be missing something. You asked for opinions, which is fine. I do that often too. But then you got opinions that said, “yes, this is a bug,” including from @Joe Ranieri. He won’t know for sure until he actually looks at it, and he will only remember to look at it once there is a Feedback request.

What I’m asking is, what more do you need to file a report?[/quote]

It is not my duty to file a bug report just because I stumbled over something like this. I can do it or not. It is up to me. No offense, but I don’t know what your question is about. This is not an open source project I have committed to spend time on. I already lost hours until I found the reason why this was not working as expected, and then I had to find a workaround. And this with a product I paid for.

b) I didn’t understand this as a confirmation that it is a bug (“believe”):[quote=283823:@Joe Ranieri]I’m not the programmer in question, but I absolutely believe it’s a bug[/quote]
What does “absolutely believing” mean anyway?

If this is really a bug, it is kind of devastating. Something so basic being so flawed years after the new framework was introduced is more than just annoying. It is not the first time that I think Xojo Inc. either does not have decent unit test sets or does not do any unit tests at all…

Kem_Tekinay · August 30, 2016, 4:50am

I guess this is a language difference. “Absolutely believe” in that context meant Joe was convinced that this was, indeed, a bug.

As for the rest, it makes no sense to me. You spent hours tracking it down, but decided to stop short of doing the one thing that would get the problem fixed for all of us? Yes, it’s a bad bug, but bugs happen and bad bugs happen, and we all work together, Xojo and its customers, to make the product better. It would have taken you less effort to file Feedback than to report it here, so I really don’t understand your objection.

No matter, I’m not really interested in discussing it further. Michel has taken the steps needed and for that I’m grateful. However, I will remember in the future when you ask for help that you were not really interested in helping us.

Eli_Ott · August 30, 2016, 6:15am

To me this means that he suspects it is a bug but he doesn’t know.

I really do not know what is problematic with my point of view in this case. I was helpful by reporting it here. I could have just said nothing since I had found a workaround.

In this case, for me “bug” means not only “possible programming bug” but also “possible language design bug (or flaw)”. In other words, is this intended behavior or not? This question remains open in this thread, even though most participants lean towards it being a “programming bug”. But as long as this is not answered clearly in this thread, I do not file a bug report. I’m used, not only with Xojo, to discussing such things in the respective forum and filing bugs only after they are confirmed. I do not see anything wrong in doing this. And I have done it here that way since 2010 and never got criticized for it.

I do not understand your last comment. I was helpful.

Andre_Kuiper · August 30, 2016, 9:28am

I agree with Eli, a half-baked approval is still a disapproval.

Absolutely nothing!

Michel_Bujardet · August 30, 2016, 12:03pm

As a matter of precaution, reporting a possible bug is always a good idea. If it is not a bug, then you get a “by design” reply and you move on. If it is indeed a bug, you know it is in the pile of things to fix. At any rate it does not hurt.

I feel it is in my own interest to report bugs. After all, do I really want an unreliable Xojo.Core.Dictionary?

Now, everybody sees things in his own way.

Andre_Kuiper · August 30, 2016, 3:59pm

And everybody deserves respect for his/her POV!

Thanks Michel for your efforts.

Norman_P · August 31, 2016, 6:40pm

We’ve isolated the cause and the good news is that … it’s not OUR bug
At least on OS X it’s Apple’s bug in their string comparison mechanism.
So we just have to figure out how to work around it / fix it for our usage and maybe eventually Apple will fix their bug.

However I would encourage EVERYONE to report what they find that seems like a bug (like this) regardless of whether they have discussed it with other forum members.
Things that go unreported are unlikely to ever get reviewed.
And if it IS a bug then we never get around to fixing it.
If it’s not a bug then we’ll mark the report that way and everyone can carry on.

NOT reporting bugs/oddities is a great way to make sure they never get fixed.

Thomas_Tempelmann · August 31, 2016, 7:11pm

Norman, can you provide some details, please? Because, as I wrote earlier, if you cannot guarantee that your hash function for the dict keys uses the same conversion as the comparison function to look them up, then it’s actually your fault as you’re implementing the dictionary algo wrong.

Kem_Tekinay · August 31, 2016, 7:15pm

Norman, does that mean that this bug only shows up on the Mac?

Norman_P · August 31, 2016, 7:19pm

We expect the comparison to be “stable” (ie. if a < b then b > a)

Internally we use CFStringCompareWithOptionsAndLocale in a manner like the following to determine which key is “larger”


#include <CoreFoundation/CoreFoundation.h>
#include <stdio.h>

int main() {
    CFStringRef str1 = CFSTR("Kota");
    CFStringRef str2 = CFSTR("Knig");

    // Non-literal comparison with forced ordering
    CFStringCompareFlags flags = kCFCompareNonliteral | kCFCompareForcedOrdering;

    // Compare in both directions; consistent results would have opposite signs
    CFIndex cmp1 = CFStringCompareWithOptionsAndLocale(str1, str2, CFRangeMake(0, CFStringGetLength(str1)), flags, NULL);
    CFIndex cmp2 = CFStringCompareWithOptionsAndLocale(str2, str1, CFRangeMake(0, CFStringGetLength(str2)), flags, NULL);

    printf("%li, %li\n", cmp1, cmp2);
    return 0;
}

Note that the results would suggest that “Kota” < “Knig” AND “Knig” < “Kota” (ie a < b and b < a)
That’s kind of hard to fathom
We’ll likely have to do something to work around this issue

So far we have only tracked this issue down on OS X
I have not investigated whether this also occurs on Windows or Linux
Since they use different underlying libs I’d be surprised if it does
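
The consistency property described above (swapping the operands must flip the sign of the result) can be checked mechanically. What follows is a minimal portable sketch in plain C, not Xojo’s or Apple’s code: `compare_fn`, `sign`, `is_antisymmetric` and `always_less` are hypothetical names made up for illustration, and `strcmp` merely stands in for the framework comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Three-way string comparator: negative, zero, or positive result. */
typedef int (*compare_fn)(const char *a, const char *b);

/* Reduce any three-way result to -1, 0, or +1. */
int sign(int v) { return (v > 0) - (v < 0); }

/* True when cmp(a, b) and cmp(b, a) agree with each other:
   swapping the arguments must flip the sign (or keep it at zero). */
bool is_antisymmetric(compare_fn cmp, const char *a, const char *b) {
    assert(cmp != NULL && a != NULL && b != NULL);
    return sign(cmp(a, b)) == -sign(cmp(b, a));
}

/* Hypothetical broken comparator for illustration only:
   it always claims the left operand is the smaller one. */
int always_less(const char *a, const char *b) {
    (void)a;
    (void)b;
    return -1;
}
```

With a byte-wise comparator such as `strcmp`, `is_antisymmetric(strcmp, "Kota", "Knig")` holds. A comparator that reports a < b in both directions, like `always_less`, fails the check, and an ordered container built on it can misplace or lose keys.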

Thomas_Tempelmann · August 31, 2016, 7:24pm

Norman, I fail to see how a dictionary key insertion has any reason to use relative comparisons.

A dictionary should only do two things with the keys: create a hash from the key string and use it to place the entry into the array of dict entries, and then use the key again to compare against the list at the hashed dict entry to decide whether the key is already in that list. That latter comparison is only an “equals or not” comparison, not a relational one (larger or smaller).
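
To make the hash-then-equality scheme concrete, here is a rough sketch of a chained hash table in plain C. It is not how Xojo.Core.Dictionary is implemented. All names (`Dict`, `Entry`, `hash_key`, `keys_equal`, `dict_find`, `dict_set`, `BUCKET_COUNT`) are hypothetical, and the choice of FNV-1a over raw bytes is an assumption made only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUCKET_COUNT 64  /* arbitrary size chosen for this sketch */

typedef struct Entry {
    char *key;
    int value;
    struct Entry *next;  /* chain of entries that share one bucket */
} Entry;

typedef struct {
    Entry *buckets[BUCKET_COUNT];
} Dict;

/* 32-bit FNV-1a over the raw bytes of the key. */
uint32_t hash_key(const char *key) {
    uint32_t h = 2166136261u;
    for (const unsigned char *p = (const unsigned char *)key; *p; ++p) {
        h ^= *p;
        h *= 16777619u;
    }
    return h;
}

/* Equality only, over the same bytes the hash function saw. */
bool keys_equal(const char *a, const char *b) { return strcmp(a, b) == 0; }

/* Hash once, then scan the single bucket for an equal key. */
Entry *dict_find(const Dict *d, const char *key) {
    assert(d != NULL && key != NULL);
    for (Entry *e = d->buckets[hash_key(key) % BUCKET_COUNT]; e; e = e->next)
        if (keys_equal(e->key, key)) return e;
    return NULL;
}

/* Insert or overwrite; returns false on allocation failure. */
bool dict_set(Dict *d, const char *key, int value) {
    Entry *e = dict_find(d, key);
    if (e) { e->value = value; return true; }
    e = malloc(sizeof *e);
    if (!e) return false;
    size_t len = strlen(key) + 1;
    e->key = malloc(len);
    if (!e->key) { free(e); return false; }
    memcpy(e->key, key, len);
    e->value = value;
    uint32_t i = hash_key(key) % BUCKET_COUNT;
    e->next = d->buckets[i];  /* push onto the front of the bucket's chain */
    d->buckets[i] = e;
    return true;
}
```

The sketch never asks which key is “larger”. It also hashes and compares the same raw bytes, so the two operations cannot disagree. A text dictionary that normalizes or case-folds keys would have to apply the same conversion in both `hash_key` and `keys_equal`.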

Could you please explain why this is used in your implementation? Are you trying to manage an ordered map here?

Also, what does the code for generating the key hash look like?

If you prefer, contact me privately, we don’t have to drag this out here.

Thomas_Tempelmann · August 31, 2016, 7:36pm

Also, making the dict locale-sensitive is a bad idea. It would mean that if I created a dictionary on one Mac and then sent the generated data to another Mac that tries to read it back into a dictionary, it’ll get different results if the Macs don’t use the same locale. That’s pretty unexpected, I’d say, and could lead to hard-to-detect bugs in an app, including data loss.

I’d totally stay away from this kind of dict unless I had better control over its comparison behavior, so that I can make it user-locale-independent. Anything else is only asking for trouble in the long run.

Kem_Tekinay · August 31, 2016, 7:43pm

I think the new dictionary is a red-black tree, if that clears anything up.

Thomas_Tempelmann · August 31, 2016, 7:46pm

Kem, good catch. That would explain a lot - apart from the fact that it gives the wrong impression about its performance.

Plus, it does not change what I say about the locale-dependent behavior, which is a no-go, IMO.

Michel_Bujardet · August 31, 2016, 7:47pm

[quote=284655:@Thomas Tempelmann]Also, making the dict locale-sensitive is a bad idea. It would mean that if I created a dictionary on one Mac and then sent the generated data to another Mac that tries to read it back into a dictionary, it’ll get different results if the Macs don’t use the same locale. That’s pretty unexpected, I’d say, and could lead to hard-to-detect bugs in an app, including data loss.

I’d totally stay away from this kind of dict unless I had better control over its comparison behavior, so that I can make it user-locale-independent. Anything else is only asking for trouble in the long run.[/quote]

How do you know it is locale-sensitive? Or is it the framework function name in the small snippet Norman posted that tells you that?

Thomas_Tempelmann · August 31, 2016, 7:48pm

Are you saying it is not locale-sensitive? Well, if that’s the case, then we’re good. I just guessed it from looking at the code, but I may have guessed wrong. Are you sure it’s locale-insensitive?

Norman_P · August 31, 2016, 7:50pm

That IS at the root of the issue and has led us to knowing what we need to do to resolve the issue.
And also to report a bug to Apple as the CF methods are returning a result that is erroneous.
We’ll figure out how to make this work.

Thomas_Tempelmann · August 31, 2016, 7:52pm

BTW, if it’s a red-black tree, as Kem suggests, it means that the dict behaves rather inefficiently when it comes to looking up complex text in a large “dict”, because every node visited on the way down requires a comparison. That is rather costly in time compared to a true “map”-like dictionary, which only has to calculate one hash code and then pick a single match in most cases, though it’ll be a bit more wasteful in memory consumption (but memory’s cheap nowadays, especially with 64 bit support, isn’t it?)

Correct me if I’m wrong.
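
To put a rough number on the comparison cost, the sketch below counts string comparisons during a lookup in a sorted key array. Binary search is used here only as a stand-in for the descent through a balanced tree. Whether the dictionary really is a red-black tree is just the suggestion made above, and `sorted_lookup` is a hypothetical helper written for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Binary search over a sorted key array, standing in for the descent
   through a balanced tree. Returns the index of target, or -1 if absent,
   and stores the number of string comparisons made in *comparisons. */
long sorted_lookup(const char *const *keys, size_t n, const char *target,
                   size_t *comparisons) {
    assert(keys != NULL && target != NULL && comparisons != NULL);
    size_t lo = 0, hi = n;  /* search the half-open range [lo, hi) */
    *comparisons = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = strcmp(target, keys[mid]);
        ++*comparisons;  /* one full string comparison per step */
        if (c == 0) return (long)mid;
        if (c < 0) hi = mid;
        else lo = mid + 1;
    }
    return -1;
}
```

The count grows with roughly the base-2 logarithm of the number of keys, so about 20 full string comparisons for a million keys. A hash lookup computes one hash and typically needs a single equality check. Each of those comparisons also has to go through whatever text comparison the framework uses, which is where an inconsistent comparator does its damage.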