SortWith behaviour with non-unique elements in the base array

I tested this as follows:

[code]dim colours() as string = array (“red”, “blue”, “green”, “red”, “red”, “green”, “blue”)
dim values() As integer = array (0, 1, 2, 3, 4, 5, 6)

app.AutoQuit = true
colours.SortWith (values)
msgbox (“Done”)

quit[/code]

And examining the colours and values array afterwards and the sort is done as I would wish - the elements of the same colour are grouped together (the order within each group doesn’t matter to me). However, the SortWith documentation appears to imply that, in cases such as the above where the base array (colours in this case) has non-unique values, the results are undefined. Does this mean that the actual behaviour of SortWith in this and similar cases may change from release to release? And if so, why?

I think that is very badly worded…
My take is

which makes sense…

dim  colours() as string = array ("red", "red", "red", "red", "red", "red", "red")
dim  values() As integer = array (0, 1, 2, 3, 4, 5, 6)

just like now, when the order within each “group”
in your example, the “red” are together, the "green"s and “blue” are grouped, but within each group the order is “whatever”
if all the base values were the same, it would be ONE group

so I think the Lang Ref needs to have the word “NOT” removed… or a much better explanation if what I just said was totally in left field

My interpretation is that the sequence of values WITHIN the same “group” is undefined. In other words, there is no secondary sort value. So don’t depend on those remaining in any particular order such as lowest original index number. The secondary sequence should be considered arbitrary and pseudo-random.

Which would be just fine with me. The online doc says:

It’s that it says “such as” which got my attention, because the base array in my example contains duplicates, that is, not unique values.

Base array items that have duplicate values will lead to the items in the associated arrays not being in a well defined order
They may not be sorted in the same order every time

The worst case would of course be having a base array that has entirely duplicate values

FYI, an algorithm that maintains the order of like values within the sorted result is called a “stable sort”.

http://www.geeksforgeeks.org/stability-in-sorting-algorithms/

sort and sortwith are not necessarily stable

[quote=340967:@Norman Palardy]Base array items that have duplicate values will lead to the items in the associated arrays not being in a well defined order
They may not be sorted in the same order every time[/quote]

This is entirely satisfactory for my use case, provided that what you describe as the behaviour is stable.

This could occur in my use case, or the base items could all be unique. It just depends what the user does.

[quote=340985:@Tim Streater]This is entirely satisfactory for my use case, provided that what you describe as the behaviour is stable.
[/quote]
stable in what sense ?
that the associated arrays are in the same order every time ? thats not assured.
or that the base array is sorted ? that is assured.

then the associated arrays may be in any order - the base will be “sorted” (but its hard to tell when all the items are the same value)

all unique is fine as they will sort as expected

I just did a test on array of 1M strings. When initially reverse sorted, sort takes about 300 ms. When initially sorted, it takes about 250 ms. When all the same, it takes 5 ms, so there is some optimization there.

When initially shuffled, it takes over 3 seconds, which was a surprise.

thats interesting but thats not what Tim asked

Provided that the non-base arrays always sort in the same way, even if that doesn’t repeat the next time those non-base arrays are sorted, then I’m happy. So, if I have a base array and two non-base arrays, that is I have Ubound triplets, I want the triplets to still be the same after the SortWith. Otherwise my data is destroyed.

Hope that makes sense.

sort with wont add or remove rows if thats what you mean

from your first post it sorts all your “reds” together
So if you imagine your data is initially like this (in rows)

 "red", 0
 "blue", 1
 "green", 2
 "red", 3
 "red", 4
 "green", 5
 "blue", 6

after sort with you will get all the “reds” together, all the “greens” together and all the “blues” together - and they will be sorted “blue” “green” “red” (sorted lexically or alphabetically)

what is NOT guaranteed is that you get the integers within those groups sorted in any specific order (ascending for instance)
in your example you have 2 blues and they could sort as “blue 6” then “blue 1” or “blue 1” then “blue 6”
ie

 "blue", 6
 "blue", 1
 "green", 2
 "green", 5
 "red", 3
 "red", 4
 "red", 0

or

 "blue", 1
 "blue", 6
 "green", 2
 "green", 5
 "red", 3
 "red", 4
 "red", 0

and similarly for the others

The associated array (in this case the values) is not necessarily in any sorted order within the groups in the base array

Yes, that’s no problem. But if I have more than one non-base array (extending my original example):

[code]dim colours() as string = array (“red”, “blue”, “green”, “red”, “red”, “green”, “blue”)
dim values() As integer = array (0, 1, 2, 3, 4, 5, 6)
dim others() as integer = array (10, 11, 12, 13, 14, 15, 16)

app.AutoQuit = true
colours.SortWith (values, others)
[/code]

Now I have a triplet (“red”, 0, 10) and another ("blue, 1, 11) and so on. What I want to ensure is that the triplets are not messed up by the sorting process, that is that the array values is sorted in the same way as the array others. Once sorted, all the “red” triplets will be sorted together - which order they are within “red” group doesn’t matter.

Also - what happens is the array lengths are not the same (not that mine will be). Does that give an exception or just more undefined behaviour?

To answer my own question, a quick test indicates an outofbounds exception.

Can the doc be updated to reflect all this?

[quote=341020:@Tim Streater]Yes, that’s no problem. But if I have more than one non-base array (extending my original example):

[code]dim colours() as string = array (“red”, “blue”, “green”, “red”, “red”, “green”, “blue”)
dim values() As integer = array (0, 1, 2, 3, 4, 5, 6)
dim others() as integer = array (10, 11, 12, 13, 14, 15, 16)

app.AutoQuit = true
colours.SortWith (values, others)
[/code]

Now I have a triplet (“red”, 0, 10) and another ("blue, 1, 11) and so on. What I want to ensure is that the triplets are not messed up by the sorting process, that is that the array values is sorted in the same way as the array others. Once sorted, all the “red” triplets will be sorted together - which order they are within “red” group doesn’t matter.
[/quote]
ALL of the extra arrays are sorted similarly.
Again if you conceive them as “rows” they are sorted by the color with the rest of the data in the “row” moving as the row moves.

[quote=341020:@Tim Streater]
Also - what happens is the array lengths are not the same (not that mine will be). Does that give an exception or just more undefined behaviour?[/quote]
You get bad things happening :slight_smile:
NEVER do this as I expect you’ll get exceptions

If SortWith is too limiting, you might also want to take a look at the option to use Sort with a delegate method.

Wow, after all these years, I never knew you could do this! Another reason to go through the docs and online reference material in my spare time.

Thanks!

That’s something that was introduced within the last year or two.

and several (many?) of us have overlooked it. me included.