Comparing 2 arrays for differences

I could use some direction as to how I might do this. I have two small one-dimensional arrays, perhaps up to 10 elements each but mostly 3 or fewer. I need to compare the two arrays and find the differences.

  1. Each array may have an equal number of elements or not. If they are of equal size they might not be identical.
  2. One array may have more elements than the other. In this case I need to find out which element is not common to both and then do something with it depending on which array it is not present in.

So by counting the two arrays I can set up an “If Then Else If” for each of the 3 possible element counts: Array1 = Array2, Array1 > Array2, and Array1 < Array2. It’s how to do the comparison of the two arrays that I’m hoping someone has some clever insight on.

I’m not sure I even need to count the 2 arrays? Maybe just compare them. Actually, the elements of the 2 arrays that are identical I can forget about. It’s the differences that I need to deal with. Thanks.

Create a Dictionary and set its keys using Array1. Cycle through Array2 and check to see if each item is a key in the Dictionary. If it is, remove it from the Dictionary; if not, add that item to another array. Once you’ve gone through Array2, transfer any remaining keys in the Dictionary to the end of your result array.

dim d as new Dictionary
for each v as Type in Arr1
  d.Value( v ) = nil
next v

dim r() as Type
for each v as Type in Arr2
  if d.HasKey( v ) then
    d.Remove v ' common to both arrays, so drop it
  else
    r.Append v ' only in Arr2
  end if
next v

dim keys() as Variant = d.Keys
for each k as Variant in keys
  r.Append k
next k

(Not tested, just to give you an idea.)

Ah, I missed how you have to know which keys belong to which array. In that case, set up two arrays for your result, one to hold the values in Array2 that are not in Array1, and the other to hold the unique keys in Array1 that you’ll get from the remaining keys of the Dictionary.
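
A minimal sketch of that two-result version, assuming String arrays with no duplicates. The name DiffArrays and its parameters are made up for illustration, and like the snippet above it is not tested:

```xojo
' Hypothetical helper: fills onlyIn1 and onlyIn2 with the values that
' appear in just one of the two input arrays. Assumes no duplicates.
Sub DiffArrays(arr1() As String, arr2() As String, onlyIn1() As String, onlyIn2() As String)
  Dim d As New Dictionary
  For Each v As String In arr1
    d.Value(v) = Nil
  Next

  For Each v As String In arr2
    If d.HasKey(v) Then
      d.Remove(v)       ' common to both arrays, forget about it
    Else
      onlyIn2.Append(v) ' in arr2 but not in arr1
    End If
  Next

  ' Whatever is still in the Dictionary was never matched by arr2
  For Each k As Variant In d.Keys
    onlyIn1.Append(k.StringValue)
  Next
End Sub
```

Called with Array("a", "b", "c") and Array("b", "c", "d") plus two empty String arrays, this should leave "a" in the first result array and "d" in the second, so you can act on each side separately.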

Duane,

After reading your post, I think this is not a complete specification of the task. For example, there is no word about sorting: are the arrays sorted initially? If not, can they be sorted?

Also, try to write at least the specification of this function:
CompareArrays(a1,a2) as WHAT?

What will the result be? An array? Some structure? How are you going to return those differences from the function?

No Ruslan, it’s not complete, I didn’t want to over spec it as it’s a bit complicated. These are or will be records in a database. I get an XML array for one and the other is a MySQL database. I figured I’d turn the MySQL records into an array and then compare.

Anyway, after taking a break it all came clear. I’m going to delete array1 and the records associated with it and copy array2 into it. That’ll do it.

But I do find this dictionary solution interesting and it may have potential.

Thank you both for your suggestions.

Probably the fastest way (which isn’t needed in your case, but may be helpful to others looking this up) is to first sort the two arrays, and then use two index variables, moving each up along with its counterpart.

Basically, after you’ve sorted both arrays, you know that if you compare a(0) with b(0), you can tell if they’re equal. If not, you know which one is smaller, so you move the index of the smaller one up until it catches up with the other side again. Every element you pass along the way is missing from the other array, and that is your difference.
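
The sort/crawl idea could look roughly like this. It is an untested sketch that assumes String arrays and that Sort orders them the same way the < operator compares them; SortedDiff and its parameter names are invented for the example:

```xojo
' Hypothetical helper: a() and b() are sorted in place, then walked with
' one index each. Values found on only one side go to onlyInA / onlyInB.
Sub SortedDiff(a() As String, b() As String, onlyInA() As String, onlyInB() As String)
  a.Sort
  b.Sort

  Dim i As Integer = 0
  Dim j As Integer = 0
  While i <= UBound(a) And j <= UBound(b)
    If a(i) = b(j) Then
      i = i + 1            ' equal: present on both sides, skip both
      j = j + 1
    ElseIf a(i) < b(j) Then
      onlyInA.Append(a(i)) ' a is behind, so a(i) cannot be in b
      i = i + 1
    Else
      onlyInB.Append(b(j)) ' b is behind, so b(j) cannot be in a
      j = j + 1
    End If
  Wend

  ' One side ran out; everything left on the other side is unmatched
  While i <= UBound(a)
    onlyInA.Append(a(i))
    i = i + 1
  Wend
  While j <= UBound(b)
    onlyInB.Append(b(j))
    j = j + 1
  Wend
End Sub
```

Note that this sorts the caller's arrays, so pass copies if the original order matters. Duplicates are paired off one by one, so a value that occurs twice in a() and once in b() shows up once in onlyInA.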

Also, if you download the “CustomEditField” code, you’ll find a “diff” example in there, which shows how to compare two arrays of text lines, learning their differences. This is probably overkill here, too, so it’s just for the record.

If memory serves, I compared the speeds of the sort/crawl method vs. the Dictionary method and the latter was faster unless the arrays were already sorted. I did this on the mailing list a while ago.

Naturally, for small data sets, it doesn’t really matter.

Kem, I guess that also depends on the type of data. Some data, like long strings, might take longer to process (for the hash code generation, mainly).

Also, the sort/crawl can deal with duplicates, while the dict method doesn’t, correct? And the dict method doesn’t let you learn of sequences, the way you need it when you compare two texts.

So, for cases where dupes and sequences are of no importance, the dictionary method is probably preferable because it’s easier to implement.

This is my method for checking whether two string arrays have the same elements, regardless of the order they are in:

[code]Function SameArrays(a() As String, b() As String) As Boolean

  if UBound(a) = UBound(b) and UBound(a) >= 0 then
    dim i as integer
    dim k as integer
    dim ok as integer
    ok = 0 ' ok counts how many times there are the same elements
    for i = 0 to UBound(a)
      for k = 0 to UBound(b)
        if a(i) = b(k) then ok = ok + 1
      next
    next

    ' every element was matched (assumes no duplicates within an array)
    if ok > UBound(a) then return true
    return false

  else
    ' different dimensions and/or the arrays are empty
    return false

  end if

End Function[/code]

If part of the data is already in a database… why not temp load the other data into a table and let SQL do the work for you?