XojoScript Dictionary

I find the lack of a Dictionary class in XojoScript inconvenient, so I wonder how much work it would be to implement one? Since XojoScript recognizes Variants and Arrays, I assume it’s possible. Does anyone have code they’d share before I start hunting?

These are intrinsic types that are part of the language. Dictionary isn’t, which is why it isn’t available in the XojoScript sandbox.

I’m still unclear. Does that mean I can’t write my own within XojoScript, even if I limited the keys to string?

you can write one. Just make a simple (non hashed) dictionary class with two variant arrays in dictionary and a couple of functions.

Which brings me back to whether anyone has code they can share? I know nothing about creating a dictionary class at the moment.

Why don’t you use an environment for this that would do the job for you, then just ‘wrap’ the real Dictionary into a XojoScript Class ?

The XojoScript Class would then store a ‘handle’ and its Methods would call some functions on the environment, passing the handle, to trigger any Method on the ‘real’ Dictionary.

Think of it a C ‘talking’ to C++. Inside XojoScript you ask for the creation of an object to the host, you receive a handle, then you just use functions calls with a handle (‘this’ instance) and other parameters.

I understand what you’re saying, and if there were to be using with a project, that’s how I’d do it. But I intend for this to be used within IDE Scripts, so that’s not an option.

Besides, I’d like to learn how to code a dictionary for my own edification.

There’s a few different ways to do a dictionary / map. The simplest is to just have an array of keys and an array of values. Some much more advanced approaches are hash tables or trees (typically red-black trees).

Since I’m looking for speed, I’d like to do a hash table, but a tree might fit the bill nicely. Thanks.

[quote=82762:@Kem Tekinay]I understand what you’re saying, and if there were to be using with a project, that’s how I’d do it. But I intend for this to be used within IDE Scripts, so that’s not an option.
Besides, I’d like to learn how to code a dictionary for my own edification.[/quote]
Ah, I understand, yes.

Then perhaps I can share my little experience about this as I had to create a ‘Dictionary’ once in C. This is one approach, there must be a lot of others:

The idea is to have a fixed length array (256,1024,etc…). Each array slot point to a liked list element. When you get a ‘Value’, then you use a Hash function to the ‘Key’ that returns an index that has to be limited to the number of entries in your array (usualy you use some kind of classic Hash function then post-process it to ‘fit’ the result to 0 - UBound(yourhasharray).

Then you test is the first element of the linked list at that array slot equals your Key, if not you go to the second linked element, etc…

If you don’t find your Key then just add a linked element to the last one and store your Value. If you find your Key then replace the Value.

Having several Key/Value pairs at one Hash Array Slot is called ‘collision’. To minimize collision then simply set a larger array.

This is just one approach, there are a lot of others. Probably some others are more efficient but this works quite well though.

Nice, thanks.

Tries are really fun
Not sure you need to go to that extent but they can be really good for certain classes of problems
And they have a nice characteristic in that they are not dependent on the data set size but rather on the # of characters required to uniquely identify any single entry
For the small number of entries in the script I would not bother

Here’s my XojoScript take:

[code]class Dictionary
//- Dictionary class - functionally equivalent to the REALbasic one, except:
//- a. No BinCount property (yet).
//- b. Value(key) will return nil if key not found, rather than triggering an exception.
//- c. Key(Index) will return nil if Index is out-of-bounds, rather than triggering an exception.
//- d. Remove(key) will not trigger an exception if key not found.
//- Instantiation: dim d as new Dictionary

private dim hash(-1) as integer
private dim item(-1) as Pair

function Count() as integer
	//- Return number of items stored in the dictionary
	return ubound(item)+1
end function

sub Value(key as Variant, assigns v as Variant)
	//- Assign an item by key to the dictionary
	if key is nil then return
	dim h as integer = key.Hash
	dim i as integer = hash.IndexOf(h)
	dim p as new Pair(key,v)
	if i<0 then
		hash.Append h
		item.Append p
	else
		item(i) = p
	end if
end sub

function Value(key as Variant) as Variant
	//- Retrieve an item by key from the dictionary. Return nil if not found
	if key is nil then return nil
	dim i as integer = hash.IndexOf(key.Hash)
	if i>=0 then return item(i).right else return nil
end function

sub Clear()
	//- Clear all items from the dictionary.
	redim hash(-1)
	redim item(-1)
end sub

function HasKey(key as Variant) as boolean
	//- Return TRUE if an item with the given key exists in the dictionary
	return hash.IndexOf(key.Hash)>=0
end function

function Key(Index as integer) as Variant
	//- Return the Index-th key from the dictionary. Index = 0, 1, ... Count-1
	if Index>=0 and Index<=ubound(item) then return item(Index).Left
end function

function Keys() as Variant()
	//- Return an array containing all the keys in the dictionary (in 1-to-1 correspondence with Values())
	dim v(-1) as Variant
	redim v(ubound(item))
	for i as integer = 0 to ubound(item)
		v(i)=item(i).left
	next
	return v
end function

function Values() as Variant()
	//- Return an array containing all the values in the dictionary (in 1-to-1 correspondence with Keys())
	dim v(-1) as Variant
	redim v(ubound(item))
	for i as integer = 0 to ubound(item)
		v(i)=item(i).right
	next
	return v
end function

function Lookup(key as Variant, default as Variant) as Variant
	//- If a value with the given key exists, return it, otherwise return default
	if HasKey(key) then return Value(key) else return default
end function

sub Remove(key as Variant)
	//- Remove the item with the given key from the dictionary
	dim i as integer = hash.IndexOf(key.Hash)
	if i>=0 then
		hash.Remove i
		item.Remove i
	end if
end sub

end class
[/code]

Paul, forgive me, I don’t want to sound harsh here but your Dictionary implementation is inefficient.

The reason is that you use Array.IndexOf for searching which basically means whenever a Key is searched, the Array is scanned from start to end until the Key/Hash value is found. This is very slow, and will be slower and slower when the Array get bigger.

Also, in your implementation, the use of a Hash is totally useless. Better directly use the Key value instead, the result would be exactly the same.

The idea behind using a Hash value in a Dictionary is to be able to directly reach a slot in an array, instead of scanning the all array to find this slot.

Cheers,
Guy.

And Paul’s dictionary will not store two different keys with same hash.

This is what I came up with, including test code. This seems to run just fine, but, as you can see, I didn’t test it extensively. Comments?

  Class Dictionary_MTC
    Private Const kBinUbound = 4998 // 1 less than some prime number
    
    Private zKeys() As Variant
    Private zValues() As Variant 
    Private zBin( kBinUbound ) As Variant
    Private zRemovedCount As Integer
    Private zLastBinIndex As Integer
    
    Private Function pGetKeyIndex ( key As Variant ) As Integer
      if key = nil then return -1
      
      zLastBinIndex = key.Hash mod ( zBin.Ubound + 1 )
      if zBin( zLastBinIndex ) is nil then
        return -1
      end if
      
      dim arr() As Integer = zBin( zLastBinIndex )
      
      dim i as Integer
      for i = 0 to arr.Ubound
        dim keyIndex as Integer
        keyIndex = arr( i )
        
        if zKeys( keyIndex ) = key then
          return keyIndex
        end if
      next i
      
      return -1
    End Function
    
    Private Sub pRaiseException()
      // Just raise a NilObjectException for now
      dim arr() as Integer
      arr = nil
      dim i as Integer = arr.IndexOf( 1 )
    End Sub
    
    Sub Value ( key As Variant, Assigns value As Variant )
      if key = nil then pRaiseException()
      
      dim keyIndex as Integer = pGetKeyIndex( key )
      if keyIndex <> -1 then
        zValues( keyIndex ) = value // Overwrite the existing value
      else
        zKeys.Append key
        zValues.Append value
        
        dim binIndex as Integer = zLastBinIndex
        dim arr() as Integer
        if zBin( binIndex ) <> nil then
          arr = zBin( binIndex )
        end if
        arr.Append zKeys.Ubound
        zBin( binIndex ) = arr
      end if
    End Sub
    
    Function Value ( key As Variant ) As Variant
      if key = nil then pRaiseException()
      
      dim keyIndex as Integer = pGetKeyIndex( key )
      if keyIndex = -1 then
        pRaiseException()
        return nil
      end if
      
      return zValues( keyIndex )
    End Function
    
    Function Lookup ( key As Variant, default As Variant ) As Variant
      if key = nil then pRaiseException()
      
      dim keyIndex as Integer = pGetKeyIndex( key )
      if keyIndex = -1 then
        return default
      else
        return zValues( keyIndex )
      end if
    End Function
    
    Function HasKey ( key As Variant ) As Boolean
      if key = nil then pRaiseException()
      return pGetKeyIndex( key ) <> -1
    End Function
    
    Function Count() As Integer
      return zKeys.Ubound + 1 - zRemovedCount
    End Function
    
    Function Keys() As Variant()
      dim k() as Variant
      
      dim i as Integer
      for i = 0 to zKeys.Ubound
        dim thisKey as Variant = zKeys( i )
        if thisKey <> nil then
          k.Append thisKey
        end if
      next i
      return k
    End Function
    
    Function Values() As Variant()
      dim v() as Variant
      
      dim i as Integer
      for i = 0 to zValues.Ubound
        if zKeys( i ) <> nil then
          v.Append zValues( i )
        end if
      next i
      return v
    End Function
    
    Sub Remove ( key As Variant )
      if key = nil then pRaiseException()
      
      dim keyIndex as Integer = pGetKeyIndex( key )
      if keyIndex = -1 then pRaiseException()
      
      zKeys( keyIndex ) = nil
      zValues( keyIndex ) = nil
      zRemovedCount = zRemovedCount + 1
      
      dim arr() as Integer = zBin( zLastBinIndex )
      arr.Remove arr.IndexOf( keyIndex )
      zBin( zLastBinIndex ) = arr 
    End Sub
    
    Sub Clear()
      redim zKeys( -1 )
      redim zValues( -1 )
      redim zBin( -1 )
      redim zBin( kBinUbound )
      zRemovedCount = 0
    End Sub
  End Class
  
  // TEST CODE
  dim d as Dictionary_MTC
  d = new Dictionary_MTC
  d.Value( 1 ) = 2
  d.Value( 1 ) = 3
  print d.Value( 1 )
  print d.Lookup( 1, 88 )
  print d.Lookup( 2, 99 )
  if d.HasKey( 1 ) then
    print "Has 1"
  end if
  if d.HasKey( 2 ) then
    print "Has 2"
  end if
  
  print str( d.Count )
  
  d.Remove( 1 )
  print str( d.Count )
  try
    print d.Value( 1 ) // Exception
  catch
    print "Exception"
  end try
  
  dim k() as Variant
  k = d.Keys
  print str( k.Ubound )

Looks fine to me.

Perhaps I would have handled zLastBinIndex differently as storing it as a property is not thread safe (well to make it fully thread safe there are other areas that need attention anyway). I would have returned zLastBinIndex and keyIndex as a Pair from pGetKeyIndex albeit this is only useful for Remove().

Another comment I have, in fact a question, is why don’t you call directly Raise New NilObjectException instead of simulating it in your pRaiseException Method ?

Aside from that I think you have chosen to make Keys() and Values() fast over Value()/Lookup() speed. What I mean is that you could have stored the Key/Value as Pair directly in the zBin array and that would have been more efficient for Value()/Lookup(). But this would have been less efficient for Keys()/Values() as well. That’s a choice. It depends on how often you are supposed to call the Values()/Keys() Methods against the Value()/Lookup() ones.

Thanks for the comments. As near as I can tell, neither Pair nor Raise are supported in XojoScript, nor are Threads, hence my choices. Perhaps I could have used a ByRef parameter to return the bin index, but it was 4:30 AM and I got lazy. :slight_smile:

Oh, I forgot you were targeting XojoScript, indeed. Forget about my comments then.

I considered implementing my own Pair class and doing something similar to what you suggested, but then I’d lose one of the features: The ability to retrieve the Keys and Values in the order they were entered. I couldn’t think of another way to achieve that while storing the actual values within the Bin array.