Smallest possible way to store a huge amount of data in a 64-bit app using the least memory

Juan_Martorell · September 8, 2020, 4:45pm

Hi.

When storing numbers as Double, they take 8 bytes of memory. If I save them as Integer, they also take 8 bytes in a 64-bit app. I’m trying to store a huge amount of simple data in multidimensional arrays. For me, storing the data as UInt8 would be enough, as I believe it is the variable type that takes the least memory for a single whole number, but I’m having problems getting UInt8 values from Double values.

How can I convert a Double value into a UInt8 value?
I’m creating multidimensional arrays of 8 columns. Each column has to store integer values from 0 to 100, so the total amount of data as Double is (100^8)*8 bytes = 8×10^16 bytes, about 80 PB of memory… which is absolutely impossible. If this were saved as UInt8, the total amount of memory for the array would be (100^8)*1 bytes, about 10 PB. Considering I’ve to create 18 of these multidimensional arrays and make math operations between them, what method would you recommend?

Best regards.
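
Editor's note: the sizes in the question are worth double-checking with a few lines of arithmetic. This is a minimal sketch in Python rather than Xojo, and the helper name `array_bytes` is made up for illustration. It assumes a dense array with 8 dimensions. Values from 0 to 100 mean 101 entries per dimension, so both 100 and 101 are shown.

```python
def array_bytes(entries_per_dim: int, dims: int, bytes_per_element: int) -> int:
    """Total bytes for a dense array with the same entry count in every dimension."""
    return entries_per_dim ** dims * bytes_per_element

PB = 10 ** 15  # decimal petabyte

as_double = array_bytes(100, 8, 8)     # 8-byte Double elements
as_uint8 = array_bytes(100, 8, 1)      # 1-byte UInt8 elements
as_uint8_101 = array_bytes(101, 8, 1)  # indices 0..100 inclusive

print(as_double / PB)     # 80.0
print(as_uint8 / PB)      # 10.0
print(as_uint8_101 / PB)  # slightly more than the 100-entry case
```

Under these assumptions a dense 8-dimensional array lands in the petabyte range even at one byte per element.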

DerkJ · September 8, 2020, 5:40pm

1: Converting any value is easy with memoryblocks.

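Var mb As New MemoryBlock(8)
mb.LittleEndian = True // set endianness explicitly
mb.DoubleValue(0) = myDouble
// These are the raw bytes of the Double's IEEE 754 representation, not a rounded numeric value
Var byte1 As UInt8 = mb.UInt8Value(0)
Var byte2 As UInt8 = mb.UInt8Value(1) // etc.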

2: You store it by reading and writing it in chunks of a predefined size.
Never try to load such an amount of data into memory without checking that at least that much memory is available. But since you are over 10GB, you don’t want to try…
Use chunked read/write.
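
Editor's note: the chunked read/write idea above could look roughly like this. It is a sketch in Python rather than Xojo. The chunk size, the file name and the function names are all arbitrary choices made for this example, not anything from the thread.

```python
import os
import tempfile

CHUNK_SIZE = 64 * 1024  # bytes per chunk; an arbitrary choice for this sketch

def write_in_chunks(path, total_bytes, value):
    """Write total_bytes bytes of a constant value without building the whole buffer."""
    chunk = bytes([value]) * CHUNK_SIZE
    remaining = total_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(CHUNK_SIZE, remaining)
            f.write(chunk[:n])
            remaining -= n

def sum_in_chunks(path):
    """Sum all byte values, reading at most CHUNK_SIZE bytes at a time."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            total += sum(chunk)
    return total

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bin")
    write_in_chunks(demo_path, 200_000, 7)
    demo_size = os.path.getsize(demo_path)
    demo_sum = sum_in_chunks(demo_path)

print(demo_size, demo_sum)  # 200000 1400000
```

Only one chunk is held in memory at a time, so the same loop shape works whatever the file size, as long as the data fits on disk.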

Jeff_Tullin · September 8, 2020, 6:02pm

you can store values of up to 100 in 4 bits
But getting at the values would be a pain ; one UINT8 would effectively be 2 cells of your array

I think you need to be doing this in a database.

but at the risk of being laughed at…

is your array
(100,8) in size?
Isn’t that just 909 bytes?

Not sure where you are getting 100 ^ 8 from…?

Paul_Rodman · September 8, 2020, 6:30pm

I think he means 8-dimensional arrays, with 100 (101?) items in each dimension. This sounds like it’s likely to either be a sparse array, or an array where most of the entries will be, say, 0. Using sparse array techniques would probably work quite well (i.e. only store non-zero elements). OTOH, I have no idea of the application here.

Juan_Martorell · September 11, 2020, 9:55am

Thanks. This is a very good tip

Juan_Martorell · September 11, 2020, 10:03am

It’s a multidimensional array, for example:

var MyArray_1(100,100,100,100,100,100,100,100) as double

100^8 comes from the number of elements in the array (100x100x100x100x100x100x100x100), and each element takes 8 bytes of memory

Juan_Martorell · September 11, 2020, 10:05am

Yes Paul, you’re right they are sparse arrays, and the software needs to interpolate all 0 values and fill fully the array with data.

Christian_Schmitz · September 11, 2020, 10:15am

You assign the value.

dim d as double = 123
dim u as uint8 = d
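
Editor's note: converting the numeric value is a different operation from reading the raw bytes of a Double (as in the MemoryBlock snippet earlier in the thread). The sketch below shows the difference in Python rather than Xojo. The helper `to_uint8` is hypothetical, and its rounding and clamping are assumptions about what is wanted; the thread does not say how a plain Xojo assignment treats fractions or out-of-range values.

```python
import struct

def to_uint8(d: float) -> int:
    """Round to the nearest integer and clamp into the UInt8 range 0..255."""
    return max(0, min(255, round(d)))

raw = struct.pack("<d", 42.0)  # the 8 bytes of the IEEE 754 double, little-endian

print(list(raw))       # [0, 0, 0, 0, 0, 0, 69, 64]
print(to_uint8(42.0))  # 42
print(to_uint8(99.6))  # 100
print(to_uint8(-3.2))  # 0
print(to_uint8(300.0)) # 255
```

None of the raw bytes equals 42, which is why the byte-by-byte route does not give a usable 0 to 100 value.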

Jeff_Tullin · September 11, 2020, 11:07am

Now I’m piqued…
What does such an array represent?

TimStreater · September 11, 2020, 11:19am

Shouldn’t this array be:

Var MyArray (100, 8)

?

Juan_Martorell · September 11, 2020, 11:28am

A multidimensional array…

https://documentation.xojo.com/getting_started/using_the_xojo_language/collections_of_data.html

Juan_Martorell · September 11, 2020, 11:30am

Thanks Christian, as always clever…

Don’t you have anything in your MonkeyBread tools for calculating NURBS surfaces?

Juan_Martorell · September 11, 2020, 11:33am

No. I need to store in the array values for 8 different variables (v1, v2, v3, v4, v5, v6, v7 & v8) and each variable can have values from 0 to 100.

Each variable is absolutely independent of the other ones, so the values of v3 could be from 0 to 100 independently of the values of the rest of variables.

The only way I know for storing this information is a multidimensional array (or a database of course…). The problem with a database is that you have to query it to get each value, so the calculations become 10 times slower than if you handle the values directly

TimStreater · September 11, 2020, 11:39am

Store them in an SQLite database table. Make it an in-memory one:

create table mytable (v1 integer, v2 integer, v3 integer, v4 integer, v5 integer, v6 integer, v7 integer, v8 integer);

An integer in an SQLite table takes only as many bytes as is required to store it, that is one in your case.
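
Editor's note: a minimal sketch of the in-memory table idea, written with Python's standard `sqlite3` module rather than Xojo's database classes. The table definition is the one from the post; the sample rows and the lookup query are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
conn.execute(
    "create table mytable (v1 integer, v2 integer, v3 integer, v4 integer, "
    "v5 integer, v6 integer, v7 integer, v8 integer)"
)

# Two made-up rows of eight values, each in the range 0..100.
rows = [
    (82, 51, 22, 93, 52, 54, 99, 100),
    (0, 0, 0, 0, 0, 0, 0, 0),
]
conn.executemany("insert into mytable values (?, ?, ?, ?, ?, ?, ?, ?)", rows)

count = conn.execute("select count(*) from mytable").fetchone()[0]
match = conn.execute(
    "select v8 from mytable where v1 = ? and v2 = ?", (82, 51)
).fetchone()[0]

print(count, match)  # 2 100
```

Each lookup here is a query, which is the per-value overhead the original poster was worried about.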

Jeff_Tullin · September 11, 2020, 11:46am

Thanks.
Good luck.

Christian_Schmitz · September 11, 2020, 11:59am

Probably something like

Dim MyArray(0,8) as Uint8

And then later when you know how many you need:

redim MyArray(count,8)

Jeff_Tullin · September 11, 2020, 12:24pm

I cannot conceive of any use for 100 x 100 x 100 x 100 x 100 x 100 etc

“values for 8 different variables (v1, v2, v3, v4, v5, v6, v7 & v8) and each variable can have values from 0 to 100.”

Starting with that, it’s simple
That starts as an array of UINT8, containing 8 elements
(82,51,22,93,52,54,99,100)
As you say, you could think of that as a long (8 bytes), or a double (8 bytes), or a memoryblock of size 8 bytes, or an array x(7) As UInt8

So let’s assume you have a simple array of 8 bytes / 1 UINT64

How many of those array do you actually need?
100 longs/byte arrays
100 x 100 longs/byte arrays?
or 100x 100 x 100 x 100 x 100 etc … and if it is this, what does it represent in the real world?
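
Editor's note: the "8 bytes as one UINT64" view above can be sketched as follows, in Python rather than Xojo. The function names `pack8` and `unpack8` are hypothetical, and the byte order (first value in the lowest byte) is an arbitrary choice.

```python
def pack8(values):
    """Pack eight values in 0..255 into one 64-bit integer, first value in the lowest byte."""
    if len(values) != 8 or any(not 0 <= v <= 255 for v in values):
        raise ValueError("expected eight values in 0..255")
    return int.from_bytes(bytes(values), "little")

def unpack8(packed):
    """Inverse of pack8: split a 64-bit integer back into its eight bytes."""
    return list(packed.to_bytes(8, "little"))

inks = [82, 51, 22, 93, 52, 54, 99, 100]  # the sample values from the post
packed = pack8(inks)

print(unpack8(packed) == inks)  # True
print(packed < 2 ** 64)         # True
```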

Paul_Rodman · September 11, 2020, 1:04pm

If they are sparse, you can use a Dictionary to store the non-zero values, with the indices combined to form a unique key. E.g.
For a three dimensional array 100 x 100 x 100,

Dim d as new Dictionary

Dim key as UInt64 = i * 10000 + j * 100 + k // unique only for indices 0..99; for indices 0..100 use 101 as the base (i * 10201 + j * 101 + k)

d.Value(key) = storedValue // puts non-zero value into array

retrievedValue = d.Lookup(key, 0) // retrieves value, or zero if not stored.
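
Editor's note: the same sparse-storage idea, sketched in Python rather than Xojo, with a plain dict standing in for the Dictionary. It assumes every index runs from 0 to 100 inclusive, so the key uses 101 as its base. The names `BASE`, `make_key` and `sparse` are made up for this example.

```python
BASE = 101  # each index runs 0..100 inclusive

def make_key(indices):
    """Combine any number of indices into one unique integer key (mixed-radix encoding)."""
    key = 0
    for i in indices:
        if not 0 <= i < BASE:
            raise IndexError(i)
        key = key * BASE + i
    return key

sparse = {}  # only non-zero cells are stored
sparse[make_key((0, 100, 0))] = 37
sparse[make_key((1, 0, 0))] = 12

print(sparse.get(make_key((0, 100, 0)), 0))  # 37
print(sparse.get(make_key((1, 0, 0)), 0))    # 12
print(sparse.get(make_key((5, 5, 5)), 0))    # 0, never stored
# With a base of 100 instead, (0, 100, 0) and (1, 0, 0) would share the key 10000.
```

The function takes a tuple of any length, so the same key scheme extends to 8 indices; 101^8 is about 1.08×10^16, which still fits in an unsigned 64-bit key.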

Juan_Martorell · September 11, 2020, 1:46pm

Hi. I know it’s hard to understand…

Each variable is in fact the % value of one ink (printing ink). Each ink (color) can go from 0 to 100 (in fact from 0 to 255, as they are encoded in 8 bits, but for the array it’s enough to have only 101 values instead of 256). You can print with 1, 2, 3, 4 (which is normal, and which you’ll know as CMYK), 6, 7, 8… or more inks. The software I’m programming can only handle up to 8 inks.

I’m building an ICC profile builder. It takes some measurements and builds an industry-standard (ICC) profile. Internally, you have to build CLUT tables for each Lab color (like XYZ coordinates) so that programs like Photoshop can interpolate between different colors and find the right RGB, CMYK or MC (MultiChannel) value.

Basically, when you have to create a lookup table for forming a particular color by mixing up to 8 different inks, you have to calculate all the colors that can be formed by mixing all the inks, and you can use 1, 2, 3 or 6 of the inks to form the color you’re looking for.

Basically it’s a 3D interpolation in space… but you have 3 axes for the colors you’re looking for (XYZ, or formally Lab) and up to 8 axes in the space in which you have to build the target 3D point.

As the color space is in fact a continuous space, you have to calculate all values (in steps of 1), so I have to calculate the full multidimensional array.

As the ICC structure has 3 input tables and 3 output tables, you have to calculate 6 of these arrays.

I know it sounds crazy… I’m explaining this just to show that I’m not trying to do something unreal or stupid.

Douglas_Handy · September 11, 2020, 1:50pm

No, for the record you can store values from 0 to 15 with 4 bits (i.e. 2^4) but need 7 bits to store values from 64-127 (i.e. 2^7) while a UInt8 can store from 0 to 255 (i.e. 2^8). So the least amount of bits needed to represent 8 values each up to 100 is 8 * 7 bits = 56 bits, or 7 bytes. But working with those for things like “18 of these multidimensional arrays and make math operations between them” would be much more complicated than just using UInt8 values for the OP’s stated desire to “interpolate all 0 values and fill fully the array with data”.

That said, it still isn’t clear to me what “interpolate all 0 values and fill fully the array with data” actually means in this case.
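
Editor's note: the 7-bits-per-value packing described above can be sketched like this, in Python rather than Xojo. The names `pack7` and `unpack7` are hypothetical, and putting the first value in the lowest bits is an arbitrary choice.

```python
BITS = 7  # 2**7 = 128 > 100, so 7 bits cover the range 0..100

def pack7(values):
    """Pack values in 0..127 into one integer, 7 bits each, first value in the lowest bits."""
    packed = 0
    for pos, v in enumerate(values):
        if not 0 <= v < 2 ** BITS:
            raise ValueError(v)
        packed |= v << (pos * BITS)
    return packed

def unpack7(packed, count=8):
    """Inverse of pack7: extract `count` 7-bit fields."""
    mask = (1 << BITS) - 1
    return [(packed >> (pos * BITS)) & mask for pos in range(count)]

inks = [82, 51, 22, 93, 52, 54, 99, 100]
packed = pack7(inks)

print(packed.bit_length() <= 56)  # True
print(unpack7(packed) == inks)    # True
```

Every read or write needs a shift and a mask, which illustrates the extra complexity mentioned above compared with one UInt8 per value.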