Delete/Change Dictionary Keys

I know when deleting items from an array, the common practice it to read the array from bottom to top so the number of items in the array never becomes less than the index being referenced…

But how does one do the same thing with a Dictionary (since the physical order is “random”)

I have two things I need to do.

  1. Delete all items from the dictionary where the key is or matches a particular pattern (keys are PAIRS), so it may be “DELETE if key.left=“X”” or it might be “DELETE is key=pair(3:2)”
  2. Alter a series of keys … “IF key.right>5 then key.right=key.right+1” … without raising an issue by creating a conflicting key

The more I think about this, the more an in-memory database might be better than a dictionary

Just delete the entries. The keys of the remaining entries don’t change - as do the indexes of the remaining entries in an array if you remove an array entry which is not the last one.

Thanks… wasn’t sure if that was static in a dictionary or not.

What about #2… the only solution I can come up with is to create a new dictionary as I read the keys from the first

Check with HasKey before setting the new value.

Not gonna wor[quote=169894:@Eli Ott]Check with HasKey before setting the new value.[/quote]
Not gonna work…

Say I have a Dictionary with 2 entries
1:2 aaaaaaa
1:3 bbbbbbb

and I want to increment the right side of each Key
When I attempt to increment the 1:2 key it then creates TWO 1:3 keys for that instant,
HasKey will simply tell me I’m about to “screw the data up”, but doesn’t provide a solution to get around it

the desired end result in this case would be

1:3 aaaaaaaa
1:4 bbbbbbb

You can get an array of keys using the Keys method. Then just loop through that.

You’re using objects for keys
So despite them having “the same values” in the data (left & right) they may still be distinct objects and the object references are unique
Hence you can get “duplicates”
One reason I did not use an object for my sparse array class so many years ago
A string that is carefully crafted lets you retain the uniqueness & the o(1) lookup
Using objects wont because the object reference is the key now

So finding 1:2 requires you to iterate over the dictionary and find ANY object that contains 1:2

Not sure how that keeps me from having two identical keys (or actually wiping out one entry by over-writing the key)

for each key as pair in mydictionary.keys

vs

for i=0 to mydictionary.count-1
   key=pair(mydictionary.keys(i)))

the instant I change the value of “key” in either case, I lose the cell that has the same key that I had not gotten to yet

1:2 aaaaaaa
1:3 bbbbbbb

  • key is 1:2
  • increment right (key is now 1:3)
  • but there the next record is also 1:3, which I haven’t yet processes (it would become 1:4)

got an example… I cannot see any way to alter a dictionary key in place

Ah but you DONT :stuck_out_tongue:
You have two distinct objects that hold the same data.
Thats not “identical keys” as far as the dictionary is concerned - they’re two distinct keys (instances) that hold the same data.

Thats VERY different

[quote=169936:@Norman Palardy]You’re using objects for keys
So despite them having “the same values” in the data (left & right) they may still be distinct objects and the object references are unique
Hence you can get “duplicates”
One reason I did not use an object for my sparse array class so many years ago
A string that is carefully crafted lets you retain the uniqueness & the o(1) lookup
Using objects wont because the object reference is the key now

So finding 1:2 requires you to iterate over the dictionary and find ANY object that contains 1:2[/quote]

Ok… taking the “pair” out of the equation… and using a “carefully crafted string” doesn’t change the problem

if the dictionary had keys of
“1”
“2”

and I wanted to end up with
“2”
“3”

the same issue exists

And are you saying that by using PAIR as my key type, I am doing “a full table scan” everytime I ask for a value, I search the entire dictionary thus losing the benefit of the hash?

so even if a PAIR doesn’t take advantage of the speed of a dictionary (which I DO need), a string creates the anomly the I’m describing?

Here’s a quick demo of what I mean

dim p1 as Pair = 1:2
dim p2 as Pair = 1:2

if p1.left = p2.left and p1.right = p2.right then
msgbox “p1 data = p2 data”
end if

if p1 = p2 then
msgbox “p1 = p2”
end if

You expect the first where the data is the same is how the dictionary compares keys
But really it does the second

[quote=169943:@Dave S]Ok… taking the “pair” out of the equation… and using a “carefully crafted string” doesn’t change the problem
[/quote]
Yes it does as STRING as keys the value of the string is compared - not the object

So you dont have to scan to see IF the key already exists
You can use HASKEY and know in one line of code

You bet
Or worse you’re getting false “doesn’t exists” checks becuase you’re checking object

With carefully crated string this is literally if dictionary.haskey( what ever ) then dictionary.remove
Using objects you have to scan EVERY value

This one you don’t have any choice but to scan since you’re altering “half” the key
But at least you can see if the proposed new value exists, which is now just HAS KEY, and take appropriate action if it does
With objects this is yet another full scan

Try it

[code] dim p1 as Pair = 1:2
dim p2 as Pair = 1:2

dim d as new dictionary

d.value(p1) = “true”

if d.haskey(p2) then
msgbox “exists already”
else
msgbox “NOT”
end if[/code]

I was suggesting that you pull the keys once, so the order doesn’t change on you.

dim k() as variant = dict.keys() For i as integer = 0 to ubound(k) ...do something Next i

right… but that still doesn’t solve the problem of CHANGING the key, regardless if it is a PAIR or STRING datatype

either that or I am missing something so fundemental I just can’t see it… But again…
If I have two entries

Key=1 data=A
Key=2 data=B

how can I end up with

Key=2 Data=A
Key=3 Data=B

The idea is the KEY represents a Row/Column (currently a PAIR, but based on Normans data soon to be changed to a string representation of the same data). So the dictionary represents a grid with only populated cells, and I want to “insert” a row or column, which means that the keys for existing dictionary entries need to be changed, without loss of data, and without changing the data portion of then entry.

This won’t work

dim k() as variant = dict.keys()
For i as integer = 0 to ubound(k)
    newkey=inc_row(k(i))
    dict.value(newkey)=dict.value(k(i))
Next i

because when K(I)=1, then newkey=2 and I end up with

Key=1 data=A
Key=2 data=A

Because it over-wrote an existing value. Checking with HASKEY doesn’t help, because it could be a cascading effect
The only workable solution I came up with (and it takes TWICE the memory is)

dim k() as variant = dict.keys()
dim new_dict as new dictionary
For i as integer = 0 to ubound(k)
    newkey=inc_row(k(i))
    new_dict.value(newkey)=dict.value(k(i))
Next i
dict=new_dict  // since DICT is global, and new_dict is local, I am assuming this releases the "old" DICT, and DICT becomes NEW_DICT?

Ah, so if you’re inserting a row, you’ll also need to adjust every cell. Let’s say you’ve got 9 rows of 5 columns and you want to insert data at row 5:

For row as integer = 10 downto 5 For column as integer = 0 to 4 If dict.haskey(str(row) + "|" + str(column)) then Dict.value(str(row + 1) + "|" + str(column)) = Dict.value(str(row) + "|" + str(column)) End If Next column Next row

Now you can Create a new row at line 5. I can’t help but think that this is going to be terribly inefficient if the number of columns gets high though (or rows if you’re inserting a column)

You need to work the keys from highest to lowest when you insert.

If I might chime in here… (I’ve been working on DataView for many months now and I ran into this same wall early on)

It looks like you’re trying to work out inserting a new column into your data…

You really shouldn’t link the data of a cell directly to the position in the list. Think of a row as a set of data. The list presents the rows in a certain order and accesses the data in that (arbitrary) order. A column isn’t a position in the row, it’s a piece of data common to the rows, with each row having a value for that column (or not).

The way I tackled this in dataView was to use an array of rowIDs and an array of columnIDs.

then you can have the dictionary store

row:column=value

100:1=aaa
100:2=bbb

with the columns= {1,2}

now when you insert a new column…

columns={1,3,2}

add a key if you like
100:3=ccc

lets say the rows() array looks like
88
100
102

your row will access the data by the keys in the arrays
| rows(0):columns(0) | rows(0):columns(1) | rows(0):columns(2)
translates to 88:1 | 88:3 | 88:2

                   | rows(1):columns(0) | rows(1):columns(1) | rows(1):columns(2)

translates to 100:1 | 100:3 | 100:2
aaa ccc bbb

The only time you need to loop through the keys, is when you want to delete a column and all of it’s data.

And I would advise against using pairs as keys (objects have overhead and pairs have no comparison operators)… try a structure and use stringvalue for the key.

[quote=169989:@Greg O’Lone]Ah, so if you’re inserting a row, you’ll also need to adjust every cell. Let’s say you’ve got 9 rows of 5 columns and you want to insert data at row 5:

For row as integer = 10 downto 5 For column as integer = 0 to 4 If dict.haskey(str(row) + "|" + str(column)) then Dict.value(str(row + 1) + "|" + str(column)) = Dict.value(str(row) + "|" + str(column)) End If Next column Next row

Now you can Create a new row at line 5. I can’t help but think that this is going to be terribly inefficient if the number of columns gets high though (or rows if you’re inserting a column)[/quote]

This solution requires having to reiterate over all the possible row/column combinations… it could be hundreds of thousands (the dictionary may only have 2 entries, but they may be [10,10] and [153032,190032]… making it process 29,000,000,000 loops that are a total waste of time… The memory restriction of using TWO dictionary is more appealing.

Jim… your solution won’t work either… as it requires arrays … which are not sparse, plus solves the problem for inserting of columns only. In my above example… you’d have two entries each with an array of near 200,000 entries of which only 2 are populated.

My dictionary structure is
Key=[row,col] Data=aaaaaaa [where data is a class object]

My need is to represent a virtually unlimited grid area in as small a foot print as possible, while being able to insert (or delete) a row and/or column (or range of rows and/or columns) as fast as possible

It may not help, but here is pseudo code for something I do that may have something in common.

Firstly, a theoretically infinite but sparse set of things that can exist at x,y (your pairs)
Each of these things have attributes, lets say right and left values.

So we have a class with RIGHT and LEFT properties, and a sparse ‘array’ of these.

You need to find things in ‘row’ 10 , or something at 10,30, then do something with it.

What I have done is to generate a dictionary, where the key is the x,y co-ordinate assembled into a decomposable string

So: object at 30,40 is

MyDictionary ("30|40") = thisobject

Now, to find the object at specifically 100,200, its

MyDictionary.HasKey (“100|200”)

and if I want all the items in row 100, I iterate through the whole dictionary (Could be slow but I havent found it to be so yet)
looking for keys that start with “100|”

[code]for each k in MyDictionary.Keys

if left(key,3) = “100” then
//stuff
end if
next[/code]

Any help, or way off the mark?

Curse this inability to edit!

if left(k,3) = “100” then