When to use array, pair, dictionary or collection

Here is a hardcore question for the programming department:

I have a 3-dimensional array, indexed by integers, with 3 decimal (Double) values in each cell. So I would use this:

Var arr_dec1(5000,5000,5000) as double
Var arr_dec2(5000,5000,5000) as double
Var arr_dec3(5000,5000,5000) as double

But I do not know whether it will be 5000 or much more, and only a fraction of the cells will be used. So my question is: does it make sense to use something other than an array, such as pairs, a dictionary or a collection? With arrays I can access the content directly with

res = arr_dec1(a,b,c)

But how much space will that take? Is the array filled at the beginning when it is built, or will it “grow” every time a new value is assigned? May I use 5000000 for each dimension? (This works!) That does not sound very smart to me. And how would I access any other storage model compared to the simple line above?

Thank you for any idea or help!

Are these arrays “dense” (i.e. every cell is filled) or not?
Do you need to be able to grow and shrink the arrays dynamically?

5000 * 5000 * 5000 * 8 bytes is about 931 GB, and that is for each of the three arrays.
It is unlikely that you have that much memory.

It may be better to use a dictionary with a good key,
e.g. x-y-z built from the index numbers:

str(x)+"-"+str(y)+"-"+str(z) and use that as key.

Or, probably more efficient, use an Int64 key (this works as long as every index stays below 10000):

(x * 10000 + y) * 10000 + z
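A minimal sketch of the Int64-key idea, assuming every index stays below 10000 and that one Double is stored per key (you would keep one dictionary per value, or store an object instead). CellKey and the variable names are made up for illustration. On iOS, the thread below suggests Xojo.Core.Dictionary instead of the classic Dictionary.

```xojo
' Hypothetical helper: packs three indices (each 0..9999) into one Int64 key.
Function CellKey(x As Integer, y As Integer, z As Integer) As Int64
  Var k As Int64 = x
  k = (k * 10000 + y) * 10000 + z
  Return k
End Function

' Usage, e.g. inside a method or event handler:
Var cells As New Dictionary

' Store a value; only cells that are actually written take up memory.
cells.Value(CellKey(10, 5, 2)) = 0.9

' Read it back; Lookup returns the default (here 0.0) for an empty cell.
Var res As Double = cells.Lookup(CellKey(10, 5, 2), 0.0)

' Walk one dimension sequentially: fix x and y, step through z.
For z As Integer = 0 To 9999
  Var key As Int64 = CellKey(10, 5, z)
  If cells.HasKey(key) Then
    System.DebugLog("z=" + Str(z) + " value=" + Str(cells.Value(key).DoubleValue))
  End If
Next
```

Stepping through a whole dimension this way costs one HasKey call per possible index, so for very sparse data it may be cheaper to loop over cells.Keys and decode each key with integer division and Mod.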

If all values are eventually filled, then you’ll need the full space (1 TB). Best to use a binary file in that case, on a disk that has enough free space, and ideally on an SSD.

Then calculate the offset into the file like this:

dim offset as Int64 = (x * (5000 * 5000) + y * 5000 + z) * 8

8 is the size of a Double value in bytes. The formula assumes every index runs from 0 to 4999.

Then read or write a DoubleValue using a BinaryStream, after setting the stream’s Position to the offset.
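A rough sketch of that file-based approach, using a small cube (100 entries per dimension) so the file stays at 8 MB. CellOffset, the file name and the temporary folder are assumptions for illustration. The property names follow API 2.0 (BytePosition; older Xojo versions call it Position), and error handling is left out.

```xojo
' Hypothetical helper: byte offset of cell (x, y, z) in a cube with n entries per dimension.
Function CellOffset(x As Integer, y As Integer, z As Integer, n As Integer) As Int64
  Var i As Int64 = x
  i = (i * n + y) * n + z ' same as x * n * n + y * n + z
  Return i * 8 ' 8 bytes per Double
End Function

' Usage, e.g. inside a method or event handler:
Var f As FolderItem = SpecialFolder.Temporary.Child("cells.bin")
Var bs As BinaryStream = BinaryStream.Create(f, True)

' Reserve room for every cell up front (100 * 100 * 100 * 8 bytes).
bs.Length = CellOffset(99, 99, 99, 100) + 8

' Write one cell.
bs.BytePosition = CellOffset(10, 5, 2, 100)
bs.WriteDouble(0.9)

' Read the same cell back.
bs.BytePosition = CellOffset(10, 5, 2, 100)
Var res As Double = bs.ReadDouble

bs.Close
```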

If it’s not all filled, then use a Dictionary, with the offset (without the * 8 factor) as the key, like Christian suggests. That’ll be the most efficient option (besides, Collections are super slow, unless Xojo has improved them using a map now?)

Or I have an old sparse array class on my website that might work for this

Wow, thank you all first.

No, the array will not be filled. I tried to make one with 2.000.000 entries per dimension and it worked. Does that mean the array “grows” as it is filled? If so, I will stay with the array. It will be at most 5% filled. Does anyone know how the memory usage of an array builds up? Does it use up half the memory if I store something at index 2500?

If this may (or will) cause trouble, I like Christian's idea most. But is reading from a dictionary as fast as an array access? Speed will be an issue here. And will it work the same way on iOS (and Android)? And I have one problem left: how do I access all parts of one dimension sequentially? With a dictionary I cannot imagine how this would work.

So I can forget thinking about collections. And what about pairs? What is that good for?

Thanks again!

Memory pages that are allocated but never touched don’t use physical memory, only virtual memory.
So it may work to allocate 1 TB of memory but only use a few bytes.

Check also IntegerToVariantHashMapMBS class, which would allow you to use integer as key directly and be faster than Xojo’s dictionary with variant as key.

Christian, thank you, I am sure this would work on desktop/web. But it will not on iOS, whereas pairs will.

I want to set up an algorithm that works in memory and on all platforms. That is why I am using Xojo.

And I am sure there will be a solution, as always.

Be aware that iOS devices are very memory restricted.
Most iPhones have closer to 1 GB of memory than the 6 GB of the high-end models.

Beside the space concern, if all values of the same index are tied together, I’d possibly use an array of classes (the class would hold your 3 values). You’d have a single array of your class and unused “indexes” would simply not appear in the array.
From there, if you need a quick way to know whether a specific index has been filled, a dictionary (key=index, value=object of your class) would be an option (use Dict.HasKey).
Just my 2 cents.

I am just thinking about using pairs as a pointer to the index (of the array of variants). This would fill up the matrix in a compressed form.

Dictionary will not work with iOS.

Use Xojo.Core.Dictionary

So now look at this. First I use a routine to “compress” the index count:

Public Property aCriterias() as Pair

Public Function Arr_set_crt(ptr as integer) as Integer

  ' look for ID in the array and append one if not found
  Var i As Integer
  For i=0 To 5000
    If i > aCriterias.LastRowIndex Then ' not found yet, add one
      aCriterias.AddRow( i : ptr )
      Return i
    End If
    If aCriterias(i).Right = ptr Then Return i ' found, leave ....
  Next 
End Function

This works and the array of indexes is filled correctly after it ran through.
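The linear search above scans the whole list on every call and stops at 5000 entries. As an alternative, the same “compression” could be sketched with a Dictionary that maps each ID to its compact index. CompactIndex and the variable names are hypothetical; on iOS the thread suggests Xojo.Core.Dictionary instead.

```xojo
' Hypothetical helper: returns the compact index for id,
' assigning the next free index the first time an id is seen.
Function CompactIndex(map As Dictionary, id As Integer) As Integer
  If Not map.HasKey(id) Then
    Var nextIndex As Integer = map.KeyCount ' number of ids stored so far
    map.Value(id) = nextIndex
  End If
  Return map.Value(id)
End Function

' Usage, e.g. inside a method or event handler:
Var criteria As New Dictionary
Var a As Integer = CompactIndex(criteria, 4711) ' first id seen
Var b As Integer = CompactIndex(criteria, 9001) ' second id seen
Var c As Integer = CompactIndex(criteria, 4711) ' known id, same index as a
```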


Public Property aCells(5000,5000,5000) as Variant
Public Property aTuple(3) as Double

Do
  ' fill the array - but only with currently selected ID
  o = Arr_set_opt(rs.Field("rating_option").IntegerValue)
  c = Arr_set_crt(rs.Field("rating_criteria").IntegerValue)
  u = Arr_set_usr(rs.Field("rating_user").IntegerValue)
  ' filling my element of the array
  aTuple(0) = 0.5
  aTuple(1) = Val(rs.Field("rating_rank").StringValue)
  aTuple(2) = rs.Field("rating_value").DoubleValue
  aTuple(3) = 0
  ' assign my content to the appropriate array cell
  aCells(o,c,u) = aTuple   ' and here it crashes completely 
  rs.MoveNext
Loop Until rs.EOF

This does not work and I do not understand why. It does not stop in the debugger; instead the complete application goes down with “quit unexpectedly” and wants to report to Apple. Anyhow: the first o,c,u values are zero. I suppose it is not possible to assign an array as a Variant to a cell of another array? I tried this here and it worked:

aTuple(0) = 0
aTuple(1) = 0.4
aTuple(2) = 0.9
aTuple(3) = 3
'
aCells(123,144,678) = aTuple

This is a hairy beast.

Thanks Norman, I will do that. Very nice. But even with that dictionary in place it will not work. It only prevents a bloated array. The problem seems to be in the assignment of an array as a Variant to another array.

You can absolutely do it.
I'd post a link but

Search my blog for “dictionary of arrays”.

So now I have shrunk it down to the point where the crash occurs. I am using this code here:

Dim aTuple(3) As Double            
Dim aCells(600,600,600) As Variant
'
aTuple(0) = 0
aTuple(1) = 0.4
aTuple(2) = 0.9
aTuple(3) = 3

aCells(10,5,2) = aTuple

aTuple = aCells(10,5,2)
MsgBox(Str(aTuple(2)))

This works and uses 1.63 GB of RAM, so the whole array is allocated even though I fill only one element. So there is no “pay per use” with arrays.

If I go higher with the dimensions, it crashes without any warning or message. On my Mac, 650 is already too much.

For my application this will mean, that max. 15% of the original array (dimensioned 5000 each) can be filled. I am not sure if this will be a lethal limitation at the end.

Finally, Norman's tip to use the core dictionary did the job. The array is now condensed as well as possible, and I can proceed with my work. Thanks to all.

It only SEEMED to work, but it did not. If I store a second tuple in another cell of the array, I do not get the first one back again:

Dim i,c,o,u As Integer
'
Dim aTuple(3) As Double              
Dim aCells(600,600,600) As Variant
'
aTuple(0) = 0
aTuple(1) = 0.4
aTuple(2) = 0.9
aTuple(3) = 3
aCells(10,5,2) = aTuple

aTuple(0) = 1
aTuple(1) = 2
aTuple(2) = 3
aTuple(3) = 4
aCells(11,6,3) = aTuple

aTuple = aCells(10,5,2)
MsgBox(Str(aTuple(2)))

This should reference the entry in 10,5,2 and give back 0.9, but it shows 3, the entry of the second tuple. I now regard this as an error in Xojo, which seems unable to handle Variants.

I will try Christian's solution now.


Next try - using classes to setup the structure:

[Screenshot 2020-08-22 at 13.24.02: class structure]

And the code:

Dim c As New Cells
Dim t As New Tuple
'
t.Heat   = 0
t.Rank   = 0.4
t.Rating = 0.9
t.Result = 3

c.Cell(10,5,2) = t

t.Heat   = 1
t.Rank   = 2
t.Rating = 3
t.Result = 4

c.Cell(11,6,3) = t

t = c.Cell(10,5,2)
MsgBox(Str(t.Rating))

It still shows 3 as the result. What is wrong here? Did I misunderstand something obvious? Is this really a bug?

You used the same object t both times.
For different data you need t = New Tuple before filling in the second set of values.
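A sketch of that fix, assuming the Cells and Tuple classes from the post above and that Cells simply stores the reference it is given. The same applies to the earlier array version: Xojo arrays are reference types too, so both cells pointed at the one aTuple array and saw its latest values.

```xojo
Dim c As New Cells

Dim t As New Tuple
t.Heat   = 0
t.Rank   = 0.4
t.Rating = 0.9
t.Result = 3
c.Cell(10,5,2) = t

' Create a fresh object for the second cell.
' The first Tuple stays untouched in cell (10,5,2).
t = New Tuple
t.Heat   = 1
t.Rank   = 2
t.Rating = 3
t.Result = 4
c.Cell(11,6,3) = t

' t now refers to the first Tuple again, not to a copy of it.
t = c.Cell(10,5,2)
MsgBox(Str(t.Rating))
```

With a separate object per cell, the last line should show the Rating that was stored in cell (10,5,2) rather than the one from the second tuple.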

Can't you do it this way?

Class Rating with properties
RatingOption
RatingCriteria
RatingUser

var Cells() As Rating

instead of Cells(RatingOption, RatingCriteria, RatingUser) =
