Compare images for similarity

This is a long shot.
Given two images of (say) 100 x 100 pixels in black and white, is there an algorithm for comparison?
Almost like OCR, except the shapes contained may not be actual letters, e.g. circles, stars, windings…

I need to do something very like OCR: take a PDF with a grid of letters and glyphs, assemble a unique list of them, and then recognise where duplicates exist, even if they are not pixel-perfect copies.

It’s actually not that hard to do. First threshold both images so that each contains only pure black and pure white pixels. Then loop through the two RGBSurfaces, count the pixels that are the same at the same position in both, and divide that by the total number of pixels. That’ll give you a percentage match. You’ll need to decide what an acceptable percentage is.
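
As a rough illustration of that counting step, here is a minimal sketch. It is in Python rather than Xojo, only so that it runs standalone. It assumes both images are already the same size and are given as rows of 0-255 grey values; `binarize`, `percent_match`, and the cutoff of 128 are hypothetical names and values, not part of any library.

```python
def binarize(image, cutoff=128):
    """Threshold a greyscale image (rows of 0-255 values): True = black, False = white."""
    return [[value < cutoff for value in row] for row in image]


def percent_match(image_a, image_b, cutoff=128):
    """Percentage of pixel positions whose thresholded values agree."""
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("images must have the same dimensions")

    a = binarize(image_a, cutoff)
    b = binarize(image_b, cutoff)

    total = sum(len(row) for row in a)
    if total == 0:
        raise ValueError("images must not be empty")

    # Compare position by position: (x, y) in A against (x, y) in B.
    same = sum(pa == pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))
    return 100.0 * same / total


scan = [[0, 255], [255, 0]]
sample = [[10, 250], [255, 255]]
print(percent_match(scan, scan))    # 100.0
print(percent_match(scan, sample))  # 75.0
```

Whether 75% counts as "the same glyph" is exactly the judgment call mentioned above.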

You may also want to add a general straightening algorithm.

I see what you’re driving at, but I doubt that would fully get me where I need to be.

Consider < and >

What about them? They’d have a very low match value.

FWIW, I’ve done this. Sold an app for many years whose sole purpose was matching scanned letters to samples of rendered fonts. It was very fast and very accurate.

Counting pixels, they should be identical, surely?

No, you’re not comparing pixel counts; you’re counting the pixels that are an exact match, position by position. Pixel 10,20 of image A should be compared to pixel 10,20 of image B. You’re not counting the number of blacks and the number of whites.

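
To make the < and > point concrete, here is a small sketch (Python, only so that it runs standalone). The 5 x 5 glyph grids and the function names `black_count` and `positional_match` are made up for illustration. The two glyphs have the same number of black pixels, yet the position-by-position score is clearly below that of a self-match.

```python
# '#' is a black pixel, '.' is a white pixel.
LESS_THAN = [
    "....#",
    "..#..",
    "#....",
    "..#..",
    "....#",
]
# Mirror each row to get the '>' shape.
GREATER_THAN = [row[::-1] for row in LESS_THAN]


def black_count(glyph):
    """Number of black pixels, ignoring where they are."""
    return sum(row.count("#") for row in glyph)


def positional_match(glyph_a, glyph_b):
    """Percentage of positions where both glyphs have the same pixel."""
    pairs = [(pa, pb) for ra, rb in zip(glyph_a, glyph_b) for pa, pb in zip(ra, rb)]
    return 100.0 * sum(pa == pb for pa, pb in pairs) / len(pairs)


print(black_count(LESS_THAN), black_count(GREATER_THAN))  # 5 5
print(positional_match(LESS_THAN, LESS_THAN))             # 100.0
print(positional_match(LESS_THAN, GREATER_THAN))          # 76.0
```

On this tiny grid the mirrored pair scores 76% against 100% for an identical pair. Much of that 76% is white background that agrees in both images, so with thin strokes the acceptance cutoff probably needs to sit well above it.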
Thanks Thom. I didn’t explain that very well.

A poor man’s OCR is to use a difference filter (which is supported by CoreGraphics). This turns matching pixels black and non-matching pixels white, with grey for the in-betweens.

You can simply count the number of black pixels and if it meets a threshold, it’s a match.
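
A minimal sketch of that idea, in Python so that it runs standalone. It does not call CoreGraphics; it only mimics the arithmetic of a difference blend on greyscale values (rows of 0-255). The function names, the `tolerance` of 16, and the `required` fraction of 0.95 are assumptions picked for illustration.

```python
def difference_image(image_a, image_b):
    """Per-pixel absolute difference: 0 (black) where equal, 255 (white) where opposite."""
    return [
        [abs(pa - pb) for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


def black_fraction(diff, tolerance=16):
    """Fraction of difference pixels dark enough to count as black."""
    values = [value for row in diff for value in row]
    return sum(value <= tolerance for value in values) / len(values)


def is_match(image_a, image_b, tolerance=16, required=0.95):
    """True when enough of the difference image is black."""
    return black_fraction(difference_image(image_a, image_b), tolerance) >= required


a = [[0, 255], [128, 0]]
b = [[0, 250], [128, 255]]
print(difference_image(a, b))                  # [[0, 5], [0, 255]]
print(black_fraction(difference_image(a, b)))  # 0.75
print(is_match(a, a), is_match(a, b))          # True False
```

The tolerance lets near-black greys (small differences from anti-aliasing or scanning noise) still count as matching.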

D’oh!

:slight_smile:

We have the GMImageMBS.Hash function (it returns a String) in the MBS Xojo GraphicsMagick Plugin. It gives you a fingerprint for an image, which you can compare with the fingerprints of other images.

That looks very interesting.
Thanks Christian.

I successfully use this:

Function picDifference(pPic1 As Picture, pPic2 As Picture) As Double
	// Returns the difference in percent:
	// 0 = identical, 100 = every channel of every pixel differs by the full 255.
	// Returns -1 if the picture sizes are not the same.
	If pPic1.Height <> pPic2.Height Or pPic1.Width <> pPic2.Width Then
		Return -1
	End If

	// Width and height of the input pictures
	Dim iInW As Integer = pPic1.Width
	Dim iInH As Integer = pPic1.Height

	// Scale down to at most iMax pixels per side to speed things up
	Dim iMax As Integer = 500

	// ... or don't, if the pictures are already smaller
	If iInW <= iMax And iInH <= iMax Then
		iMax = Max(iInW, iInH)
	End If

	// Scale ratio (never above 1)
	Dim dRatio As Double = Min(iMax / iInW, iMax / iInH)

	// Width and height of the work pictures.
	// Round avoids losing a pixel to floating-point error; Max keeps very thin pictures at least 1 pixel.
	Dim iWrkW As Integer = Max(1, Round(iInW * dRatio))
	Dim iWrkH As Integer = Max(1, Round(iInH * dRatio))

	// Work pictures
	Dim pWrkPic1 As New Picture(iWrkW, iWrkH)
	Dim pWrkPic2 As New Picture(iWrkW, iWrkH)

	// Do the scaling
	pWrkPic1.Graphics.DrawPicture(pPic1, 0, 0, iWrkW, iWrkH, 0, 0, iInW, iInH)
	pWrkPic2.Graphics.DrawPicture(pPic2, 0, 0, iWrkW, iWrkH, 0, 0, iInW, iInH)

	// Read pixels through RGBSurface, which is much faster than Graphics.Pixel
	Dim surf1 As RGBSurface = pWrkPic1.RGBSurface
	Dim surf2 As RGBSurface = pWrkPic2.RGBSurface

	// Compare pixels line by line, summing the per-channel differences (each scaled to 0..1)
	Dim dDiff As Double = 0
	Dim cPx1, cPx2 As Color
	For y As Integer = 0 To iWrkH - 1
		For x As Integer = 0 To iWrkW - 1
			cPx1 = surf1.Pixel(x, y)
			cPx2 = surf2.Pixel(x, y)
			dDiff = dDiff + ((Abs(cPx1.Red - cPx2.Red) + Abs(cPx1.Green - cPx2.Green) + Abs(cPx1.Blue - cPx2.Blue)) / 255)
		Next x
	Next y

	// Average over all pixels and the 3 channels, expressed as a percentage
	dDiff = (100 * dDiff) / (iWrkW * iWrkH * 3)

	Return dDiff
End Function
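
For the original goal of assembling a unique list of glyphs, a score like the one picDifference returns could be used roughly as follows. This is only a sketch, in Python so that it runs standalone: glyphs are assumed to be same-sized rows of 0-255 grey values, and `percent_difference`, `unique_glyphs`, and the 5% cutoff are hypothetical names and values.

```python
def percent_difference(glyph_a, glyph_b):
    """Mean absolute grey-level difference as a percentage (0 = identical)."""
    pairs = [(pa, pb) for ra, rb in zip(glyph_a, glyph_b) for pa, pb in zip(ra, rb)]
    return 100.0 * sum(abs(pa - pb) for pa, pb in pairs) / (255 * len(pairs))


def unique_glyphs(glyphs, max_difference=5.0):
    """Return (uniques, assignment): the first glyph of each group,
    and for every input glyph the index of the unique it matched."""
    uniques = []
    assignment = []
    for glyph in glyphs:
        for index, known in enumerate(uniques):
            if percent_difference(glyph, known) <= max_difference:
                assignment.append(index)  # near-duplicate of a known glyph
                break
        else:
            uniques.append(glyph)         # nothing close enough: a new glyph
            assignment.append(len(uniques) - 1)
    return uniques, assignment


g1 = [[0, 255], [255, 0]]
g2 = [[5, 250], [255, 0]]   # g1 with a little scanning noise
g3 = [[255, 0], [0, 255]]   # the inverse pattern
uniques, assignment = unique_glyphs([g1, g2, g3, g1])
print(len(uniques), assignment)  # 2 [0, 0, 1, 0]
```

Each glyph is compared only against the first representative of each group, so the result can depend on the order in which the glyphs are scanned.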