Possible Xojo.Core.Dictionary bug

[code] Using Xojo.Core
Using Xojo.Introspection

Dim objects() As Object

objects.Append New TestClass(“Kota”)
objects.Append New TestClass(“Knig”)

Dim ti As TypeInfo = GetType(objects(0))
Dim pis() As PropertyInfo = ti.Properties
Dim pi As PropertyInfo = pis(0)

Dim buckets As New Dictionary()

For i As Integer = 0 To objects.Ubound
Dim obj As Object = objects(i)
Dim value As Text = pi.Value(obj)
Dim objectsInBucket() As Object
If buckets.HasKey(value) Then
objectsInBucket = buckets.Value(value)
End
objectsInBucket.Append(obj)
buckets.Value(value) = objectsInBucket
Next

BREAK[/code]

The buckets dictionary now contains only one entry instead of two. It is the one with “Kota” as key. If you switch the order of the objects in the array, it will be the one with “Knig” as key.

I tested other cases and this is what I could come up with:

  • One key has to be any letter from &uC0 to &uFF.
  • The other key has to be &uD6, the “latin small letter O with diaeresis” (note that the other two German letters with “umlaut”, and , will work fine.

A simpler example:

  dim d as new Xojo.Core.Dictionary
  d.Value( "Kota" ) = "Kota"
  d.Value( "Knig" ) = "Knig"
  
  MsgBox d.Count.ToText // Should be 2 but is 1 ("Knig")

Good, if unfortunate, catch. Please post the Feedback request so I can attach my test project.

This only happens if the keys are Text, btw. The equivalent String keys do not exhibit this bug.

That reminds me of a bug in the Dictionary class that I found in 2008. I even explained the reason for the bug to the programmer at RS and he didn’t believe me, despite me showing him that I could clearly reproduce the error. I was then fixing the bug myself and finally he agreed to the fix, despite still not understanding what the issue was.

In short, the issue was that hashing the keys is not enough to manage a dictionary. One has to also handle collisions (two keys having the same hash code), and therefore maintain the keys in a list per hash code - and that part was missing until 2008. It happened only rarely because the hash code was rather large, but imagine you using a Dictionary and only sometimes getting wrong results from it.

[quote=283822:@Thomas Tempelmann]That reminds me of a bug in the Dictionary class that I found in 2008. I even explained the reason for the bug to the programmer at RS and he didn’t believe me, despite me showing him that I could clearly reproduce the error. I was then fixing the bug myself and finally he agreed to the fix, despite still not understanding what the issue was.

In short, the issue was that hashing the keys is not enough to manage a dictionary. One has to also handle collisions (two keys having the same hash code), and therefore maintain the keys in a list per hash code - and that part was missing until 2008. It happened only rarely because the hash code was rather large, but imagine you using a Dictionary and only sometimes getting wrong results from it.[/quote]

I’m not the programmer in question, but I absolutely believe it’s a bug – though not that bug.

Agreed. It would be odd to see these two short strings having the same hash code. That’d be quite a flaw, indeed.
To make sure it’s not the same kind of bug, checking the hash values for these strings should be sufficient - if they’re not equal, it can’t be related. If they’re equal, then it can be the same problem.

I’d really like to see this fixed but still don’t see a Feedback request for it. Did I miss it, @Eli Ott ?

I didn’t file anything since it is not clear if this is really a bug. Maybe this is the intended behavior.

It would seem pretty preposterous that Xojo intended “Koñta” and “König” to be taken as the same key. That is a bug, no doubt about it. Who knows what other huge issues can be found with accented characters in keys ?

Or, is it because they are classes ? Even in that case, it would not seem normal that the keys be confused.

You might have missed this:

Please file. Let’s get it fixed.

I have not missed this. He believes…

Here’s a better demo with less clutter:

[code] Using Xojo.Core

dim buckets As New Dictionary()
For each v as Text in Array (“Kota”, “Knig”)
buckets.Value(v) = true
Next

if buckets.Count <> 2 then MsgBox “failure”[/code]

Note: As soon as the first line is removed, it does not fail any more. So it appears to be the Xojo.Core.Dictionary that’s the culprit here. Also, using String instead of Text fixes it.

Important: The text comparison that happens before inserting (as well as looking up) the text into the dictionary must be using the very same comparison rules that the hashing function uses.

The easiest way to make sure that happens would be to not use a comparison function with locale-aware options but instead use a fully binary (exact) comparison, but use the same “convert to lowercase” function for both the hashing and the strings that are used for the insertion and lookup comparisons.

However, even at lower case, these two strings would not appear similar. So this is probably a different bug. But I still implore the Xojo engineers to make sure the hash + comparison are secured the way I wrote above, regardless. (I say this in regards to that other long-standing dictionary I mentioned in a earlier post in this thread, saying that these things easily get overlooked and only happen to us non-English speaking, and only occasionally, which is the worst kind of bug.)

Eli, I must be missing something. You asked for opinions, which is fine. I do that often too. But then you got opinions that said, “yes, this is a bug,” including from @Joe Ranieri. He won’t know for sure until he actually looks at it, and he will only remember to look at it once there is a Feedback request.

What I’m asking is, what more do you need to file a report?

Well, any of us can file that bug report now, we have enough material in the thread.

feedback://showreport?report_id=45079 (beta bug, not accessible out of the beta group)

Thanks Michel.

Why did you make this a beta bug? It’s been around from the start, it seems

I did not test in previous versions. This is a good occasion to look into it anyway.

Well, I did, and I can confirm it’s happening in 2016r1.1

Well, you know the drill.