I can think of three ways to test strings for equality with case sensitivity: the String.Compare function, an encoding such as EncodeHex, or a hash such as SHA256 or SHA512 (the code below uses SHA512). (This is needed because a Xojo string remembers case, but the default comparison operators ignore it.) The EncodeHex method seems best, with the following results:
6.372 seconds for 100,000 String.Compare
0.0220001 seconds for 100,000 HexEncoding comparisons
1.605 seconds for 100,000 Hashing comparisons
0.01 seconds for 100,000 case-insensitive string comparisons
These results were obtained after running the following code:
dim strTest1 as string = "Test"
dim strTest2 as string = "Test"
dim intRounds as integer
dim strMessage as string
// Each test runs 100,000 rounds and appends its elapsed time to strMessage.
// Test 1: String.Compare with the CaseSensitive option
dim dblSeconds as double = xojo.core.date.Now.SecondsFrom1970
do until intRounds = 100000
  intRounds = intRounds + 1
  if strTest1.Compare(strTest2, ComparisonOptions.CaseSensitive) <> 0 then
    break
  end if
loop
strMessage = str(xojo.core.date.Now.SecondsFrom1970 - dblSeconds) + " seconds for 100,000 String.Compare"
// Test 2: compare the hex encodings of both strings
dblSeconds = xojo.core.date.Now.SecondsFrom1970
intRounds = 0
do until intRounds = 100000
  intRounds = intRounds + 1
  if EncodeHex(strTest1) <> EncodeHex(strTest2) then
    break
  end if
loop
strMessage = strMessage + chr(13) + str(xojo.core.date.Now.SecondsFrom1970 - dblSeconds) + " seconds for 100,000 HexEncoding comparisons"
// Test 3: compare the SHA512 hashes of both strings
dblSeconds = xojo.core.date.Now.SecondsFrom1970
intRounds = 0
do until intRounds = 100000
  intRounds = intRounds + 1
  if crypto.SHA512(strTest1) <> crypto.SHA512(strTest2) then
    break
  end if
loop
strMessage = strMessage + chr(13) + str(xojo.core.date.Now.SecondsFrom1970 - dblSeconds) + " seconds for 100,000 Hashing comparisons"
// Test 4: plain (case-insensitive) comparison as a baseline
dblSeconds = xojo.core.date.Now.SecondsFrom1970
intRounds = 0
do until intRounds = 100000
  intRounds = intRounds + 1
  if strTest1 <> strTest2 then
    break
  end if
loop
strMessage = strMessage + chr(13) + str(xojo.core.date.Now.SecondsFrom1970 - dblSeconds) + " seconds for 100,000 case-insensitive string comparisons"
TextArea1.Text = strMessage
EncodeHex ignores string encoding and operates directly on the bytes of the string. String.Compare has to evaluate the bytes in light of the encoding, so it will take longer. I’m a little surprised by your results of a straight string comparison. Seems like it shouldn’t be the fastest.
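To make the byte-level point concrete, here is a minimal sketch. It assumes both strings are UTF-8 literals, and the variable names are only for illustration. One caveat follows from the same behaviour: because EncodeHex ignores the encoding, the same text stored in two different encodings (say UTF-8 and UTF-16) would produce different hex and so compare as unequal.

```xojo
Var a As String = "Test"
Var b As String = "TEST"

// The default = operator ignores case, so this is True.
Var sameIgnoringCase As Boolean = (a = b)

// EncodeHex exposes the raw bytes, which differ where the case differs.
Var hexA As String = EncodeHex(a) // "54657374"
Var hexB As String = EncodeHex(b) // "54455354"

// The hex strings differ, so this is False.
Var sameBytes As Boolean = (hexA = hexB)
```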
Wrapping EncodeHex in an Extends method also lets you apply the encoding with the usual dot notation, which gives more readable code than is possible with String.Compare:
If strTest1.EncodeAsHex = strTest2.EncodeAsHex then
The Module’s method extends strings like this:
Function EncodeAsHex(Extends str As String) As string
return EncodeHex(str)
End Function
(Although for some reason, in our copy of 2020r3.2 I had to toggle the Private and Public Scopes in the Method’s menu before it was recognised in other parts of the project.)
I didn’t even think to check StrComp, as I thought it would be about the same as Compare. This could use an explanation from the engineers as to what it’s doing differently.
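For anyone who wants to add StrComp to the benchmark, a minimal sketch of the call follows. It assumes the classic StrComp function is still available in your Xojo version (it is deprecated in API 2.0), where mode 0 requests a binary, case-sensitive comparison. I have not timed it here.

```xojo
Var strTest1 As String = "Test"
Var strTest2 As String = "test"

// Mode 0 asks StrComp for a binary (byte-wise) comparison,
// so strings that differ only in case are reported as different.
If StrComp(strTest1, strTest2, 0) = 0 Then
  // The bytes are identical.
Else
  // The strings differ, possibly only in case.
End If
```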
Checking it in Instruments (which is a fantastic tool, by the way) it looks like most of the time is being wasted doing a Runtime Stack Check inside the TextEncoding.OperatorCompare function.
Edit to add: Correction, about 60% of the time is wasted in the stack check, but even with that removed, RuntimeCompareTextWithOptions still looks rather slow…
That’s a pretty interesting topic. In my app I have a routine that checks the letters of every single string for upper or lower case, to prepare them for another process. It uses the following function:
Private Function IsLowercase(value As String) As Boolean
Var iResult As Integer = value.Compare(value.Lowercase, ComparisonOptions.CaseSensitive)
Return iResult = 0
End Function
Yes, this function takes some time on a large string, because it is called for each single character within a loop. Since String.Compare does not seem to be well optimized for this, do you have another suggestion for speeding it up without String.Compare?
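One option, borrowing the EncodeHex idea from earlier in the thread, might look like the sketch below. The name IsLowercaseHex is hypothetical and I have not benchmarked it, so treat it as something to measure rather than a confirmed speed-up. Like the original, it returns True for strings that contain no letters at all.

```xojo
// Hypothetical variant of IsLowercase that avoids String.Compare.
Private Function IsLowercaseHex(value As String) As Boolean
  // EncodeHex works on the raw bytes, so the two hex strings match
  // only when lowercasing leaves every byte of value unchanged.
  Return EncodeHex(value) = EncodeHex(value.Lowercase)
End Function
```

Each call builds two hex strings of twice the input length, so on very long strings the extra allocation may eat into whatever is gained.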