var chars() as string = s.Split( "" )
var upperChars() as string = s.Uppercase.Split( "" )
for index as integer = 0 to chars.LastIndex
  if chars( index ).Asc = upperChars( index ).Asc then
    // uppercase
  else
    // lowercase
  end if
next
Private Function IsLowercase(value As String) As Boolean
  Return value.Asc = value.Lowercase.Asc
End Function
instead of
Private Function IsLowercase(value As String) As Boolean
  Var iResult As Integer = value.Compare(value.Lowercase, ComparisonOptions.CaseSensitive)
  Return iResult = 0
End Function
makes a difference of 20 ms in my test case. Any further suggestions for optimization?
But thinking about it more: where speed is important, you are calling the Lowercase function for each character of the string instead of just once on the whole thing. If your string is 100 characters, it probably doesn't make a difference. If it's 100k, you might feel that.
In other words, instead of converting each character to lowercase before the comparison, convert the string to lowercase, then compare each character to the corresponding character of the original.
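A minimal sketch of that idea, assuming `s` holds the string being examined. The name `lastShared` is only illustrative. The sketch also assumes that `Lowercase` normally keeps the character count unchanged; since that may not hold for every non-ASCII character, the loop stops at the shorter of the two arrays.

```xojo
// Lowercase the whole string once instead of once per character.
Var chars() As String = s.Split("")
Var lowerChars() As String = s.Lowercase.Split("")

// Guard against the two arrays differing in length.
Var lastShared As Integer = Min(chars.LastIndex, lowerChars.LastIndex)

For index As Integer = 0 To lastShared
  If chars(index).Asc = lowerChars(index).Asc Then
    // already lowercase (or a character without case, such as a digit)
  Else
    // uppercase
  End If
Next
```

This has the same shape as the uppercase loop at the top of the thread, with the per-character `Lowercase` call replaced by a single call on the whole string.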
And after reading through here, I was interested in how NSStringCompareMBS would perform if included as another test case (based on the code in the original post).
So I added the following two test runs:
dblSeconds = Xojo.Core.Date.Now.SecondsFrom1970
intRounds = 0
Do Until intRounds = 10000
  intRounds = intRounds + 1
  If NSStringCompareMBS(strTest1, strTest2, 0) <> 0 Then
    Break
  End If
Loop
strMessage = strMessage + Chr(13) + Str(Xojo.Core.Date.Now.SecondsFrom1970 - dblSeconds) + " seconds for 10,000 NSStringCompareMBS case-sensitive string comparisons"
dblSeconds = Xojo.Core.Date.Now.SecondsFrom1970
intRounds = 0
Do Until intRounds = 10000
  intRounds = intRounds + 1
  If NSStringCompareMBS(strTest1, strTest2, 1) <> 0 Then
    Break
  End If
Loop
strMessage = strMessage + Chr(13) + Str(Xojo.Core.Date.Now.SecondsFrom1970 - dblSeconds) + " seconds for 10,000 NSStringCompareMBS case-insensitive string comparisons"
And got the following results:
0.0206921 seconds for 10,000 String.Compare
0.0049689 seconds for 10,000 HexEncoding comparisons
0.5892110 seconds for 10,000 Hashing comparisons
0.0012398 seconds for 10,000 case-insensitive string comparisons
0.0022290 seconds for 10,000 NSStringCompareMBS case-sensitive string comparisons
0.0021710 seconds for 10,000 NSStringCompareMBS case-insensitive string comparisons
Note: I included the results of all tests because I'm using Xojo 2021r1.1 on a 2018 Mac Mini (10.15.7) with a 3.2 GHz 6-core Intel Core i7 and 32 GB RAM.
My conclusion was, for case-insensitive string comparisons use the = or <> operators. And for case-sensitive matches, use NSStringCompareMBS - if available to you and appropriate.
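As a rough sketch of that conclusion, the two cases could be wrapped like this. The function names are hypothetical. The flag value 0 is simply the one used in the case-sensitive test run above, and the second function assumes the MBS plugin is installed.

```xojo
// Case-insensitive: the built-in = operator already ignores case.
Private Function EqualsIgnoringCase(a As String, b As String) As Boolean
  Return a = b
End Function

// Case-sensitive via the MBS plugin (flag 0, as in the test run above).
Private Function EqualsCaseSensitive(a As String, b As String) As Boolean
  Return NSStringCompareMBS(a, b, 0) = 0
End Function
```

Without the plugin, `a.Compare(b, ComparisonOptions.CaseSensitive) = 0` gives a case-sensitive check too, at the slower String.Compare timing shown above.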
There must be a way to use memory blocks efficiently for this, although I can't think of one at the moment. I use MemoryBlocks for case-sensitive "select case".
For String.Compare I filed feedback case 64647, as it converts all Strings to Text and then does the compare, which makes it slower than it needs to be. And on macOS the compare may create a CFString internally (another copy in addition to the Text) to do the compare.
For NSStringCompareMBS you similarly have the overhead of a plugin function call, which is not as efficient as it could be (see 62010). And then our plugin does a CFString/NSString comparison for you.
String.Compare may be slower than other methods, but in one project it was the only way for me to get a decent ordering of a ListBox containing "Umlaute". (StrComp didn't work properly.) I don't know how else I could have done it, so I am glad I have this option.
In a nutshell, some characters can be represented in two different ways: one code point that represents the character, or a series of two or more code points.
To see this in action in Xojo, try this:
MessageBox "e" + &u0300
(I am greatly simplifying the issue here.)
Normalization is the process of getting all the characters represented in the same way, either Composed (one code point) or Decomposed (two or more code points). Once you have achieved that consistency, things like sorts and searches will work properly.
My M_String project includes normalization code, as do the MBS plugins.
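To make the two representations concrete, here is a small sketch that builds the same visible character both ways and shows their byte counts. It assumes the string literals are UTF-8, and it does not normalize anything itself.

```xojo
// Two spellings of the same visible character.
Var composed As String = &u00E8            // one code point, U+00E8
Var decomposed As String = "e" + &u0300    // "e" followed by a combining grave accent

// The byte counts differ, so a byte-wise comparison treats these as
// different strings until both are normalized to the same form.
MessageBox(composed + ": " + composed.Bytes.ToString + " bytes, " + _
  decomposed + ": " + decomposed.Bytes.ToString + " bytes")
```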