Delete Duplicate Rows in a ListBox

Emile_Schwarz · October 30, 2014, 8:48am

I want to implement a Delete duplicate Rows in a ListBox.

I am unsure of what to do.

Heres the pseudo code I have in mind:

Do a global scan in a main loop
a. Store one Row Contents in RowSource --> LB.Cell(LoopIdx,-1)
That loop uses DownTo 0

b. Add an inside loop (sub loop) to scan the whole ListBox
Compare RowSource contents to LB.Cell(LoopIdx - 1,-1)
If I found a duplicate, delete it
and remove 1 to LoopIdx

What is your feeling ?

Heres the code I actually wrote:

[code] Dim RowCnt As Integer // Number of Rows in the ListBox
Dim LoopIdx As Integer // Main Loop Indice
Dim DbleIdx As Integer // Sub Loop Indice
Dim RawSource As String // Row contents to compare to the rest of the ListBox contents

// Get the number of rows in the ListBox
RowCnt = LB_Bookmarks.ListCount // 1-based value

// Loop thru the ListBox contents
For LoopIdx = RowCnt DownTo 1
// Get One Row Contents (to compare to the other Rows)
RawSource = LB.Cell(LoopIdx - 1,-1)

// Scan the ListBox for a duplicate
For DbleIdx = 1 to LoopIdx - 1
  If RawSource = LB.Cell(DbleIdx,-1) Then
    // I found a duplicate, so delete it
    LB.RemoveRow LoopIdx - 1
    
    // Remove 1 to LoopIdx
    LoopIdx = LoopIdx - 1 // What can be the impact on the main loop ?
  End If
Next

// To avoid 1, Infinite Loop
If UserCancelled Then Exit

Next[/code]

The code above seems to work. Not placed in the correct project, but it works.

BTW: after a simple check, it seems that TWO instances of the same data stays on the end of the process: two Rows with the same contents and I do not make a real compare between the original contents and the resulting contents. I only checked the results.

Worst, I forgot that the file I used as source to delete duplicates is a bit bad: Column 2 data does not always correspond to Column 1 data on a Row basis. *

So, I have something wrong, but how do I correct that ?

That source file was created at an early stage of the project when I had bad storage for unknown reasons (Column 1 have an URL, and Column 2 have that url Title). Sometimes I had URL with another Title (a previous one it seems).

Albin_Kiland · October 30, 2014, 8:56am

An alternative way to do it might be to put the listbox rows into a Dictionary as keys and then back to the Listbox.
A Dictionary can’t hold duplicate keys so any duplicate will be removed.
Have not tried this though

Emile_Schwarz · October 30, 2014, 9:54am

Albin,

thank you for the idea. I love it ! I will try it.

Axel_Schneider · October 31, 2014, 2:36am

  Dim source As Listbox = lb // your ListBox Name
  Dim i, j As Integer
  For i = 0 To source.ListCount-1
    For j = i + 1 To source.ListCount-1
      If source.Cell(i, -1) = source.Cell(j, -1) Then
        source.RemoveRow(j)
      End If
    Next
  Next

Robert_Cheek · December 28, 2017, 7:32pm

Axel,
That routine for removing duplicates worked very well for a combo box also. I used it to remove duplicates where a user may have typed the same thing as was in the List.

Thanks,
Bob - N4RFC

Peter_Job · December 28, 2017, 7:47pm

[quote=140083:@Axel Schneider]Dim source As Listbox = lb // your ListBox Name Dim i, j As Integer For i = 0 To source.ListCount-1 For j = i + 1 To source.ListCount-1 If source.Cell(i, -1) = source.Cell(j, -1) Then source.RemoveRow(j) End If Next Next[/quote]

Maybe it is late and I am tired, but if you remove a row, are your i and j counters not going to then go past the end of the rows as the listcount reduces?

Tim_Hare · December 28, 2017, 7:50pm

No, because source.ListCount-1 is evaluated each time through the loop. But it will skip rows. It will work for a single duplicate, but may miss multiples.

DaveS · December 28, 2017, 8:05pm

when removing items for any type of list… always go in REVERSE

 Dim source As Listbox = lb // your ListBox Name
  Dim i, j As Integer
  For i = source.ListCount-1 downto 0
    For j = source.ListCount-1 downto i+1
      If source.Cell(i, -1) = source.Cell(j, -1) Then
        source.RemoveRow(j)
      End If
    Next
  Next

will NOT skip rows, will NOT miss multiple duplicates

PaulS · December 28, 2017, 8:29pm

@Dave S
I believe that the j-for-loop isnt going to work.
I think the code should be

For j = source.ListCount-2 downto 0

AlbertoD · December 28, 2017, 8:47pm

Paul, I was looking at it, but I think that j will not start until it has a value higher than i+1. For example:
source.ListCount = 10
For i = 9 downto 0
If i = 9, then i+1 = 10, so
For j = 9 downto 10, I think this for will not execute.
Then i = 8, i+1 = 9
For j = 9 downto 9, this for will be executed 1 time, comparing source.Cell(8, -1) = souce.Cell(9, -1)
and if they are equal, remove source.RemoveRow(9)

So I think the code is right (I’m remembering/learning how to code).

Or if you want to avoid the first For j to not run, then:

Dim source As Listbox = lb // your ListBox Name Dim i, j As Integer For i = source.ListCount-2 downto 0 For j = source.ListCount-1 downto i+1 If source.Cell(i, -1) = source.Cell(j, -1) Then source.RemoveRow(j) End If Next Next

I hope I’m right.

PaulS · December 28, 2017, 8:59pm

If I = 9 downto 0
The i loop starts with 9 so you want to check out j from 8 to 0 for duplicates.
That’s why I think the j for loop has to be For j = source.ListCount-2 downto 0.

Just my 2 cents.
[edit]
In fact I even believe that the i loop should be For i = source.ListCount-1 downto 1
Because the last one can never be a duplicate.
[/edit]

AlbertoD · December 28, 2017, 9:46pm

Paul, I did this test:

Dim i,j As Integer Dim sourceListcount As Integer = 10 For i = sourceListcount-1 DownTo 0 For j = sourceListcount-1 DownTo i+1 MsgBox(i.ToText + " " + j.ToText) // Then changed to: Listbox1.AddRow(i.ToText,j.ToText) Next Next

The code is executed and the first ‘i’ reported is 8 and the first ‘j’ reported is 9.
So when i = 9 then j = 9 DownTo 10, the code inside the For j … Next is not executed.
Once i =8 and j= 9 DownTo 9, the code is executed once (i=8, j=9)

Of course, this example is not deleting anything, so it is run as it were no duplicates.

scott_boss · December 28, 2017, 9:58pm

I know this is way off topic but please excuse me.

I have a method I use to add/update rows of the listbox. I use a RowTag using a unique ID to determine which row is which. Then whenever I want to add (or update) data, i call that method. I know that I wont have duplicate data as I would have rows with duplicate RowTags with duplicate Unique IDs. And since I walk the list of RowTags looking for one that matches before doing a true add row. If I find it in the RowTags, I update that row with the new data over adding a new row. That way I can not have the deduplicates.

I also have a method to remove a row based on the RowTag Unique ID. And if the Unique ID doesnt exist on any rows RowTags then I dont do anything. Its already gone.

This way you dont have to run through the #s rows over and over again to determine if there is duplicates.

But now I will return you back to your normal thread/discussion.
sb

Peter_Job · December 28, 2017, 10:41pm

I did not realise that, thanks.

Wayne_Golding · December 29, 2017, 4:15am

That is not the way I read it. The i loop iterates the listbox from bottom to top. The j loop removes any equivalents from the bottom to i + 1 thus keeping the first entry for each potential duplicate.