Best method to compare pictures?

I need to compare lines and rows of a picture to find identical lines and rows.
As far as I know there is no operator “picturecompare” or something like that, so in a first quick & dirty hack I instantiate two strings and fill them in a loop with the rgbsurface.pixelvalues of two concurring lines and perform a simple “if row1 = row2 …”. This works but is incredibly slow for bigger pictures.

I know I could replace the strings by Memoryblocks like in the Xojo Faststring example. This would surely speed up things, but I guess there must be a much better method.

Instead of filling a string pixel by pixel I tried to declare two new picture objects, filling them with a row by using picture.graphics.drawpicture with the parameters set to copy a 1 pixel high/broad row or line. This seems to work (and quite fast), but it brings the problem with comparing pictures up again.

I tried to copy the pictures to TIFF memory blocks which runs and is very fast again, but again I cannot successfully compare them. Seems they are filled differently, or rather the match fires now a hundred lines or so later than the string compare, in the middle of my test picture instead of close to its border where the algorithm starts.

So I wonder if there is a way to copy a picture’s row or line completely and fast to an object whose contents can be successfully compared? Or maybe I can combine two pictures with an OR and check if the result is 0, which it should be if both are the same. But I’m not really successful trying that.

Do you have any hints?

addendum: the last operator should rather be XOR, of course.

Maybe you could speed up things if you only checked one pixel at a time until you found a difference. Pseudocode:

i = 0
While i < image1.width
  if image1.row1.pixel(i) = image2.row1.pixel(i) then
    i = i + 1 // check the next pixel
    if i = image1.width then
      // Found a match, do whatever needs to be done here
    end if
  else
    // the rows are not equal, take the corresponding action
    exit // leave the loop at the first difference
  end if
wend

It’s not what you were looking for but might speed up your current method.

Pixe

Here is an implementation of Pixe’s idea:

Sub findIdenticalColsAndRows(pic1 As Picture, pic2 As Picture, ByRef identCols() As Integer, ByRef identRows() As Integer)
  Dim i As Integer
  Dim j As Integer
  Dim surf1 As RGBSurface
  Dim surf2 As RGBSurface
  
  if (pic1.Width = pic2.Width) and (pic1.Height = pic2.Height) then
    
    surf1 = pic1.RGBSurface
    surf2 = pic2.RGBSurface
    
    ' find matching columns
    
    i = 0
    while i < pic1.Width
      j = 0
      while (j < pic1.Height) and (surf1.Pixel(i, j) = surf2.Pixel(i, j))
        j = j + 1
      wend
      
      if j >= pic1.Height then
        identCols.Append i
      end if
      
      i = i + 1
    wend
    
    ' find matching rows
    
    i = 0
    while i < pic1.Height
      j = 0
      while (j < pic1.Width) and (surf1.Pixel(j, i) = surf2.Pixel(j, i))
        j = j + 1
      wend
      
      if j >= pic1.Width then
        identRows.Append i
      end if
      
      i = i + 1
    wend
    
  end if
End Sub

To use, simply call:

findIdenticalColsAndRows(pic1, pic2, identCols, identRows)

The matching columns and rows are respectively stored in the identCols and identRows arrays. Perhaps there are ways to speed up this algorithm even more?

Thanks a lot, you two, but I needed a way to really compare complete lines (sometimes I check for a match, sometimes for a mismatch, so I need to be certain that’s workable and fast).

I seem to have figured out a way of doing this now, but something is strange:

When I compare strings like written above, the results for my test picture are row 33 and 474.

In my modified method, I use memoryblocks instead of strings to fill with the RGB values.
In order to compare them easily, the method (which only delivers one row or column, the compare is done elsewhere) converts the memory block to a string before returning it.
In other words:

dim currentline  as  new memoryblock(dimension)


for q = 0 to dimension
  currentline.ColorValue(q,16)= rgbs.pixel(x,y)
  x=x+deltax
  y=y+deltay
next

return str(currentline)

This is incredibly fast compared to the string method. With a simple timer it took 18 ticks on a small test image, now the result gets delivered after 0–1 ticks.

But: The results are now 31 and 482. How come there’s a different result when comparing differently encoded results?
(Feel free to inform me on wrong coding, which is the most probable source, I guess).

Remember pictures are not precise files. Sometimes just loading and saving a graphic will alter it ever so slightly. So while two pictures may LOOK identical to you… in strict mathematical terms they are nowhere close.

If I’m not mistaken the findIdenticalColsAndRows() method does compare complete lines. For the columns it starts comparing the pixels from the top to the bottom, but exits immediately if two pixels aren’t equal (e.g. there is no need to compare the rest of the line if the first pixel of the line doesn’t match… this saves some processing time)… but perhaps I’m misunderstanding the requirements for the comparison?

One could potentially implement a fault tolerance when comparing two pixels. E.g. if the difference between the two pixels is very slight, it is still taken as a match.
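A minimal sketch of such a tolerance check, assuming a per-channel threshold is what is wanted. The function name pixelsMatch and the tolerance parameter are made up for illustration; nothing like this appears in the code posted above.

```xojo
' Hypothetical helper, not from the thread: treats two colors as equal
' when every channel differs by at most "tolerance" (0 means exact match).
Function pixelsMatch(c1 As Color, c2 As Color, tolerance As Integer) As Boolean
  If Abs(c1.Red - c2.Red) > tolerance Then Return False
  If Abs(c1.Green - c2.Green) > tolerance Then Return False
  If Abs(c1.Blue - c2.Blue) > tolerance Then Return False
  Return True
End Function
```

In findIdenticalColsAndRows() the test surf1.Pixel(i, j) = surf2.Pixel(i, j) could then be swapped for something like pixelsMatch(surf1.Pixel(i, j), surf2.Pixel(i, j), 2). The threshold of 2 is only a guess and would need tuning against real pictures.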

One of the most potent methods to compare images is the use of Neural Networks.

See http://www.codeproject.com/Articles/15949/Hopfield-model-of-neural-network-for-pattern-recog

Thank you all, but no need to get so complicated. This is basically for a small tool that’s intended to help find (real) similarities in button-style graphic objects so one hopefully can get a method and property that allow for true scalable buttons – meaning the border will not get scaled while the interior does so.

For this, Dave, I checked one instance of an imported picture for similarities in its rows. While you’re right with respect to compression artifacts and color profiles, your consideration should not be relevant in this case, I guess.

And sure, Alwyn, the method above does so, but I needed to extract the lines fast. This seems to be fine now, and on test images not suited for this tool (only to see if it terminates correctly) that used to hold my Mac breathless for one and a half minutes, this is now done in about ten seconds. I still don’t know why the results differ, but from examining those lines in Photoshop the comparison seems to be “more correct” than the one using strings. Whyever!

I don’t think Neural Networks would help in this case. There is no reference model to train the network, and I don’t see how you would implement an unsupervised training either: it can be done with a number of pairs of images, but there is no underlying pattern that the model can figure out.

Julen

I thought originally this was a straight up image comparison but I see the use case is too specific.

Had it been about image comparison I’d have mentioned that I recently found some Python code, which I want to port in the future, that is used by a program to compare images and find possible matches:

https://code.google.com/p/comictagger/source/browse/trunk/comictaggerlib/imagehasher.py

This algorithm compares multiple images and ranks them by “proximity”. In the program it’s used to find the proper “Cover” for a specific comic magazine out of multiple results, when the only reference is an image file.

It implements this algorithm, called “Perceptual Hash”:

http://syntaxcandy.blogspot.com/2012/08/perceptual-hash.html

Thanks to you all, especially to Julen and Alwyn for setting me on the right track. I redesigned the structure and made it not obligatory anymore to have complete rows or lines available. Instead, I simply compare as you suggested on pixel base now and leave the loop if pixels don’t match or when I scanned the whole line.
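For anyone reading along, the per-pixel approach described here might look roughly like the sketch below when two rows of the same picture are compared. The name rowsMatch and its parameters are assumptions for illustration, not the actual code of the tool.

```xojo
' Hypothetical helper: True when rows y1 and y2 of the same surface are identical.
Function rowsMatch(surf As RGBSurface, rowWidth As Integer, y1 As Integer, y2 As Integer) As Boolean
  Dim x As Integer
  For x = 0 To rowWidth - 1
    If surf.Pixel(x, y1) <> surf.Pixel(x, y2) Then
      Return False ' the first difference decides, the rest of the row is skipped
    End If
  Next
  Return True ' the whole row was scanned without a mismatch
End Function
```

A call such as rowsMatch(pic.RGBSurface, pic.Width, y, y + 1) would then answer both questions (match or mismatch) without building any intermediate string or memory block.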

This time I could see the first version – comparing strings – gave correct results, while using a memory block and converting it to a string, although much faster, was not exactly on point. I suspect the error in converting the block to a string.

Not relevant, because not filling up a memory block anymore gives a speed factor of 100 on bigger pictures. Even the “wrong” test picture, quite big, which needs to have every pixel scanned at least twice, takes now only 2 secs to scan where it had been about 90 initially.

Thanks a lot again, you all.

You’re welcome Ulrich. It is good to hear that the picture comparisons are now faster.

Just out of interest, what are the dimensions of your pictures?

2 seconds still sounds like a very long time to compare the pictures.

Your memoryblock code is flawed. Your ColorValue is 2 bytes, but your offset is moving by one each time. You’re overlaying 1 byte of the previous value each time through the loop.

dim currentline as new memoryblock(dimension*2)

' stop at the last valid 2-byte slot; offset dimension*2 would lie past the end of the block
for q = 0 to dimension*2 - 2 step 2
  currentline.ColorValue(q,16)= rgbs.pixel(x,y)
  x=x+deltax
  y=y+deltay
next

return str(currentline)
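One more thing that may matter if the block is turned into a string for comparison: as far as I know, the = operator on Xojo strings is case-insensitive, so two lines with different raw bytes could still compare as equal. Below is a hedged sketch that stores 4 bytes per pixel and compares byte-wise with StrComp in mode 0 (binary). The names extractLine and linesEqual, and the choice of 32 bits per pixel, are assumptions and not part of the code above.

```xojo
' Hypothetical helper: copies "count" pixels, starting at (x, y) and stepping by
' (deltaX, deltaY), into a MemoryBlock using 4 bytes per pixel.
Function extractLine(rgbs As RGBSurface, x As Integer, y As Integer, deltaX As Integer, deltaY As Integer, count As Integer) As MemoryBlock
  Dim mb As New MemoryBlock(count * 4)
  Dim px As Integer = x
  Dim py As Integer = y
  Dim q As Integer
  For q = 0 To count - 1
    mb.ColorValue(q * 4, 32) = rgbs.Pixel(px, py) ' the offset advances by the full pixel size
    px = px + deltaX
    py = py + deltaY
  Next
  Return mb
End Function

' Hypothetical helper: byte-wise comparison of two extracted lines.
' StrComp mode 0 compares binary data, so case folding cannot hide a difference.
Function linesEqual(a As MemoryBlock, b As MemoryBlock) As Boolean
  If a.Size <> b.Size Then Return False
  Return StrComp(a.StringValue(0, a.Size), b.StringValue(0, b.Size), 0) = 0
End Function
```

Storing the full 32 bits also avoids squeezing each color into 16 bits, which might otherwise make slightly different colors look the same.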

Once again thank you, Alwyn and Tim. Excellent, so now I know the reason for the memory block giving wrong results.
The memory block is not needed anymore, but I try to keep that information in mind.

And Alwyn: This is for a photo with roughly 2500 x 3300 pixels which makes no sense to check with my tool. It was just to see how the program behaves with pictures where it cannot find any repeating patterns. For real life applications there is as good as no processing time. But if this should work like intended, I’m gonna put it incl. source text into public domain anyway – after all, this is meant to create real scalable buttons and similar image objects, which maybe can help in designing windows elements –, and, being still quite a newbie to Xojo I’m sure there’ll be a lot of improvements to be found.