Thank you @Rick_A for your explanation.
The keys and data are strings of variable lengths.
In what limited time I had yesterday, I devised the following strategy, which seems to be similar to what I understand from your suggestion.
The datafile consists of the following sections.
- Header, including values to convert UniqueID into an index.
- LOOK table, a series of Ptrs (basically a LUT)
- DATA table, where the keys and data are actually stored.
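To make the three sections concrete, here is a minimal sketch of how such a file could be written. Everything here is an assumption for illustration: the field names, the `ODID` magic, the 8-byte pointers, and the length-prefixed record format are mine, not part of the original design.

```python
import struct

# Hypothetical on-disk layout (illustrative, not the author's actual format):
#   Header : magic, numberOfKeys, lookOffset, dataOffset
#   LOOK   : numberOfKeys fixed-size pointers (8-byte offsets into DATA)
#   DATA   : buckets of variable-length key/value records
HEADER_FMT = "<4sIQQ"   # magic, numberOfKeys, lookOffset, dataOffset
PTR_SIZE = 8            # each LOOK entry is one 8-byte file offset

def build_file(buckets):
    """buckets: dict mapping bucket index -> list of (key, value) byte pairs."""
    num = len(buckets)
    header_size = struct.calcsize(HEADER_FMT)
    look_offset = header_size
    data_offset = look_offset + num * PTR_SIZE

    data = bytearray()
    ptrs = []
    for i in range(num):
        ptrs.append(data_offset + len(data))
        bucket = buckets.get(i, [])
        data += struct.pack("<H", len(bucket))  # record count (for collisions)
        for key, value in bucket:
            data += struct.pack("<HI", len(key), len(value)) + key + value

    out = struct.pack(HEADER_FMT, b"ODID", num, look_offset, data_offset)
    out += b"".join(struct.pack("<Q", p) for p in ptrs)
    return out + bytes(data)
```

Because every LOOK entry is a fixed size, one seek gets you to the pointer and a second seek gets you to the data, regardless of how large the file grows.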
The process should flow… (ha!)
- Convert the key into a unique number by using a custom hash.
- Convert that number into an index, by scaling it to fit between 0 and numberOfKeys, multiplying it by ptrSize and adding the LOOK table offset.
- Use the index to jump straight into the LOOK table and read the dataPtr.
- Use the dataPtr to go straight to the correct position in the DATA table and read the data.
There will be collisions, so colliding entries are stored in an array along with their keys, which lets the lookup step through the array to the correct key.
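The lookup steps above can be sketched like this. Again, the header format, pointer size, and bucket record layout are my own assumptions for the sake of a runnable example, and `hash_fn` stands in for the custom hash still to be written.

```python
import struct

HEADER_FMT = "<4sIQQ"   # assumed: magic, numberOfKeys, lookOffset, dataOffset
PTR_SIZE = 8            # assumed: 8-byte file offsets in the LOOK table

def lookup(buf, key, hash_fn):
    """Find `key` (bytes) in an in-memory copy of the datafile."""
    magic, num, look_off, data_off = struct.unpack_from(HEADER_FMT, buf, 0)
    # Scale the hash into [0, numberOfKeys) and turn it into a LOOK offset.
    idx = hash_fn(key) % num
    (ptr,) = struct.unpack_from("<Q", buf, look_off + idx * PTR_SIZE)
    # Jump straight to the bucket in DATA and scan the collided records.
    (count,) = struct.unpack_from("<H", buf, ptr)
    pos = ptr + 2
    for _ in range(count):
        klen, vlen = struct.unpack_from("<HI", buf, pos)
        pos += 6
        k = buf[pos:pos + klen]; pos += klen
        v = buf[pos:pos + vlen]; pos += vlen
        if k == key:
            return v
    return None  # key hashed to this bucket but isn't stored
```

On a real file the two `unpack_from` reads on the LOOK and DATA tables would each be a seek-and-read, so a hit costs two disk reads plus however many collided records share the bucket.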
It is far from perfect, but it should give me what I want. I can see the custom hash is probably where most of the time will be spent in getting it right. I need it to be quick, and at the same time produce as few collisions as possible.
The hardest part is: what do I call this?
Ohanaware Disk Indentured Dictionary?