Dictionary Vs Array tradeoffs

I know the differences between dictionaries and arrays… but when do they start mattering in terms of speed?

Let me explain… I can have a list of strings all 255 Chars or less, and usually much less (maybe 25-35). I need to associate each string with a sequential number that starts from 0…

I can create a dictionary with the string as the key and number as the value to look up the number.
Or I can just put the strings into an array and use array.indexof

There is no question that while it will use more memory, for a large number of strings a dictionary would be much faster…

But what about for a small number of strings? Say 20, 30, or 100? Would Array.indexOf really be significantly slower?
What I am asking is for case like this, how many elements does it take for dictionary to be significantly faster and make it the best choice?

  • Karen

In my experience a dictionary lookup takes about 30,000 milliseconds. So if stepping through the array takes less…

I would advise that you do some profiling, with varying record amounts, and different methods for testing the values on an array. I’d also suggest that you test with record amounts that you don’t expect, you know when the unexpected happens.

Another way you could do it is to create a single memoryblock, insert the data in a known format such as ID. Use InstrB to find the matching string, then use it again to find the ID offset (using the location of the string as a starting point). InstrB can be really fast although I’ve never actually profiled it.

I would be the array is much faster. Just test it out.

depends on how you’d use arrays to mimic dictionaries
if you have “key” and “value” then arrys need 2 matched arrays - one for keys and the other for values

that said here my little test method

Public Sub testSpeedDiff(param1 as Integer)
  
  Dim keys() As String
  Dim values() As String
  
  Dim dict As New dictionary
  
  // now into the arrays we put keys and value and the same set into the dict
  
  Redim Keys(param1)
  Redim values(param1)
  
  TextArea1.AppendText "Size = "  + Str(param1) + EndOfLine
  
  
  For i As Integer = 0 To param1 
    
    keys(i) = Str(i)
    values(i) = Str(i)
    
    dict.value(Str(i)) = Str(i)
  Next
  
  Dim randomKeys() As String
  Dim r As New random
  
  For i As Integer = 1 To 100
    Dim keyToLookup As Integer = r.InRange(0, param1)
    randomKeys.append str(keyToLookup)
  Next
  
  Dim startTime As Double
  Dim endTime As Double
  
  starttime = Microseconds
  For i As Integer = 0 To randomKeys.ubound
    Dim idx As Integer = keys.IndexOf( randomKeys(i) )
    Dim s As String = values(idx)
  Next
  endTime = Microseconds
  TextArea1.AppendText "arrays took " + Str(endTime-startTime) + " msec" + EndOfLine
  
  starttime = Microseconds
  For i As Integer = 0 To randomKeys.ubound
    Dim s As String = dict.Lookup( randomKeys(i), "" )
  Next
  endTime = Microseconds
  TextArea1.AppendText "dictionary took " + Str(endTime-startTime) + " msec" + EndOfLine
End Sub

Size = 10
arrays took 66.31799 msec
dictionary took 299.209 msec
Size = 100
arrays took 353.892 msec
dictionary took 192.686 msec
Size = 1000
arrays took 3204.535 msec
dictionary took 205.907 msec
Size = 10000
arrays took 38622.73 msec
dictionary took 277.168 msec
Size = 100000
arrays took 422759.3 msec
dictionary took 354.648 msec

dictionaries have a much “flatter” perforamce curve mostly unrelated to their size (it is but require some math to explain that I just dont feel like going into - look into the performance characteristics of red-black trees)

arrays are much more linear compared to their size

1 Like

Thanks All…

No surprise … As Norm’s data illustrates, my testing with random strings of different lengths shows that it depends both on string length and # of items

With strings in the size range I expect most of the time, they are about the same speed with 25-30 items … which is also what i think the typical range for number of items will be… so If the #of items is <= 30 I will use an array and if more a dictionary.

If there was an array.IndexOfB I suspect the array could have more elements.

  • karen

I benchmarked the old dictionary vs array several years ago and my data was similar to Norm’s. In the use case I was most interested in at the time, I had a set of about 100 strings with an average length of 20 characters, and for that case the dictionary was dramatically faster (>30X) than a brute force search through the array.

However, if I sorted the array and did a binary search, then the array was 4X faster than a dictionary, for my data set.

So a binary search might be the best overall solution (if sorting the array is acceptable).