Find duplicate in listbox

Hi,

I want to find duplicate rows in a listbox, but based from multi-column value,

Dim source As Listbox = Listbox1 Dim ix, jx As Integer For ix = source.ListCount-1 DownTo 0 For jx = ix + 1 To source.ListCount-1 If source.Cell(ix,0) = source.Cell(jx,0) then if source.Cell(ix,1) = source.Cell(jx,1) then if source.Cell(ix,2) = source.Cell(jx,2) then if source.Cell(ix,3) = source.Cell(jx,3) Then Listbox1.cell(jx,1)="duplicate" end if end if end if end if Next Next

when I do running the code, no error found but not executed.

any helps?

thanks
regards,
arief

in debug mode you can use break points and step through.
i would use a method to compare 2 rows and 2 cells. its better for debugging.

Have a look at the inner 4 ifs and try to figure out where your values go into the “else”. Which you should add. Best try that with just a row or 2 in the listbox. Then please combine the inner 4 ifs to one line.

I love a morning puzzle. This might do the trick:

Public Sub FindDuplicates(lbSource as Listbox)
  dim iMaxRow as Integer = (lbSource.ListCount - 2)
  dim iMaxCol as Integer = (lbSource.ColumnCount - 1)
  
  for iRow as Integer = 0 to iMaxRow
    dim iNextRow as Integer = iRow + 1
    
    for iColumn as Integer = 0 to iMaxCol
      if lbSource.Cell(iRow, iColumn) <> lbSource.Cell(iNextRow, iColumn) then
        // Not a match, check the next row
        continue for iRow
        
      end
      
    next
    
    // If we reach this point, the row is a duplicate
    lbSource.Cell(iNextRow, 1) = "Duplicate"
    
  next iRow
End Sub

Dear mr. Parnell,

The code is working, but its a bit different with my goal.

What I need is, to find duplicate value based from 4 columns,
and the column name (Date, Time, ID, and product code)

the duplicated criteria should based from that column exactly.

thanks

regards,
arief

dim iMaxCol as Integer = 4

Dear Ms. Willius,

Thanks, it works, but with the condition, If First Row and next rows have the same value.
The code doesn’t work if the duplicate rows in scramble position.

[code]2019-11-28 10:09:02 001 AW001
2019-11-28 10:09:03 001 AW002
2019-11-28 11:06:00 002 AW003
2019-11-28 10:09:02 001 AW001
2019-11-28 10:09:03 001 AW002

this will not work,[/code]

[code]2019-11-28 10:09:02 001 AW001
2019-11-28 10:09:02 001 AW001
2019-11-28 10:09:03 001 AW002
2019-11-28 10:09:03 001 AW002
2019-11-28 11:06:00 002 AW003

when I do sorting in Time Header, then it works[/code]

thanks
regards,
arief

I mostly put an object-references in the rowtag of each row and use these objects for business and the listbox itself as just the view for the user. This makes live a lot easier and in some cases much faster.

To resolve this kind of troubles, I sorted my Listbox before applying the search duplicates. But that project was data base driven and I was able to sort more than one column.

The usual way to remove duplicates is

  1. Sort
  2. Iterate through rows and remove duplicates

To do it with the value of 4 columns (or more), concatenate their value in a hidden column, and sort on that column, then scan all rows and remove duplicates.

Chances are sort is way faster than trying to do it in code.

myArray.Sort ? I am not so happy about business-functionality left over to the listbox control which is just GUI.

No. ListBox.Sort.

http://documentation.xojo.com/api/deprecated/listbox.html#listbox-sort

I believe concatenating the columns content into a hidden column and sorting on it should be faster than trying to deduplicate in code.

Then cycle through rows, and if a row is identical to the previous one, delete it.

[quote=465164:@Arief Sarjono]

[code]2019-11-28 10:09:02 001 AW001
2019-11-28 10:09:03 001 AW002
2019-11-28 11:06:00 002 AW003
2019-11-28 10:09:02 001 AW001
2019-11-28 10:09:03 001 AW002

this will not work,[/code]

I created a simple app with a listbox and populated it with these data and ran your original code.

It DID work! However, I did not use a header in listbox.

I believe that your header is somehow getting in the way. Make sure that your code is not treating the header line as data and that your For…Next loops are starting and ending as you intend.

I thought that too, listboxes are just for viewing but I found sorting a 44,000 row listbox is much faster than any other method of sorting including arrays and SQL and requires just one line of code. I don’t know how they do it but it’s incredibly fast and shouldn’t be discounted as a viable solution based on the result.

I like using dictionaries for things like this. Works for any number of columns.

Public Sub FindDupeRows(lb as listbox) dim d as new dictionary dim s as string dim dupes() as integer dim u as integer = lb.ListCount-1 for r as integer = 0 to u s="" for c as integer = 0 to lb.ColumnCount-1 s=s+lb.cell(r,c) +"|" next if d.haskey(s) then dupes.Append r else d.value(s)=r end if next End Sub

The | delimiter prevents cases where a single column of “1234” would match columns of “1”,“2”,“3”,“4”

Now that you have an array of dupes, you can remove them.

for r as integer = ubound(dupes) downto 0 listbox1.RemoveRow dupes(r) next

Cool…

Thanks for all the helps…

its working now…

regards,
Arief