I gave up on fixing it many posts ago, but if people cite me, why should I keep quiet and not respond to them?
Is there any way, using String, to get the answer I would expect (1) for each of Garry’s cases? Or if I wanted to return the leftmost character of both Strings? Left(1) won’t work, would anything?
The Characters iterator is the only way. Or convert to Text which handles it differently.
I guess CountFields("") would probably work too, at least for the length.
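For the "leftmost character" part of the question, here is a minimal sketch of the iterator approach. FirstCharacter is a hypothetical helper name, not a framework method, and it assumes String.Characters yields whole user-perceived characters as described above.

```xojo
// Hypothetical extension method (put it in a module).
// Returns the first character produced by the Characters iterator,
// or an empty string when s is empty.
Function FirstCharacter(Extends s As String) As String
  For Each c As String In s.Characters
    // The first item the iterator produces is the leftmost character.
    Return c
  Next
  Return ""
End Function
```

Usage would look like `Var first As String = someString.FirstCharacter`. It returns on the first pass through the loop, so the cost does not grow with the length of the string.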
I use Text in my projects, but I’m curious about solutions that don’t use deprecated features.
I’m just passing along solutions from those who cannot post anymore. Your best bet for a non-deprecated fix is probably going to be a feedback case. Sorry if I’ve distracted.
Thanks Tim, no worries.
dim s1 as string = "añb"
dim i as integer = s1.indexof("b")
dim s2 as string = s1.middle(i-1,1)
break
Nice
Or the Characters iterator, as mentioned above.
Even funnier
dim s1 as string = "añb"
s1 = s1.ConvertEncoding(Encodings.UTF32)
dim i as integer = s1.indexof("b")
dim s2 as string = s1.middle(i-1,1)
break
Here’s some very rough (barely tested) code that implements grapheme cluster friendly string manipulation methods as extensions to the string data type.
Please be aware that they will be slower than the normal string functions because they use the Characters iterator. I believe they could be made faster if Xojo implemented them directly within the framework using the OS / ICU string C functions, but even those require some kind of character iteration, so they would never be as fast as the current Xojo string functions.
Public Function CharacterLength(Extends pString As String) As Integer
  // Counts characters by walking the Characters iterator.
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma StackOverflowChecking False
  Dim theResult As Integer
  theResult = 0
  For Each char As String In pString.Characters
    theResult = theResult + 1
  Next
  Return theResult
End Function
Public Function CharacterLeft(Extends pString As String, pLength As Integer) As String
  // Returns the first pLength characters, stopping the iteration as soon as enough are collected.
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma StackOverflowChecking False
  Dim theResult(-1) As String
  If pLength > 0 Then
    For Each char As String In pString.Characters
      theResult.Append(char)
      pLength = pLength - 1
      If pLength = 0 Then
        Exit For
      End If
    Next
  End If
  Return Join(theResult, "")
End Function
Public Function CharacterSplit(Extends pString As String) As String()
  // Returns the characters of pString as a zero-based array.
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma StackOverflowChecking False
  Dim theResult(-1) As String
  For Each char As String In pString.Characters
    theResult.Append(char)
  Next
  Return theResult
End Function
Public Function CharacterRight(Extends pString As String, pLength As Integer) As String
  // Returns the last pLength characters (the whole string if it is shorter than pLength).
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma StackOverflowChecking False
  Dim theResult(-1) As String
  Dim charArray(-1) As String
  Dim count, i As Int32
  If pLength > 0 Then
    charArray = pString.CharacterSplit
    count = UBound(charArray)
    i = Max(count - (pLength - 1), 0)
    While i <= count
      theResult.Append(charArray(i))
      i = i + 1
    Wend
  End If
  Return Join(theResult, "")
End Function
Public Function CharacterMiddle(Extends pString As String, pStart As Integer, Optional pLength As Integer = -1) As String
  // Returns pLength characters starting at the zero-based character index pStart.
  // A pLength of -1 (the default) means "to the end of the string".
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma StackOverflowChecking False
  Dim theResult(-1) As String
  Dim charArray(-1) As String
  Dim count, i As Int32
  If (pLength = -1) Or (pLength > 0) Then
    charArray = pString.CharacterSplit
    i = pStart
    If pLength = -1 Then
      count = UBound(charArray)
    Else
      count = Min(i + (pLength - 1), UBound(charArray))
    End If
    While i <= count
      theResult.Append(charArray(i))
      i = i + 1
    Wend
  End If
  Return Join(theResult, "")
End Function
Public Function CharacterIndexOf(Extends pString As String, Optional pStart As Integer = 0, Optional pFind As String, Optional pOptions As ComparisonOptions = ComparisonOptions.CaseInsensitive, Optional pLocale As Locale) As Integer
  // Returns the zero-based character index of the first occurrence of pFind
  // at or after pStart, or -1 if there is none.
  // Every candidate start position is checked in full, so a partial match
  // (searching for "ab" in "aab", or in a string that ends in "a") can neither
  // hide a later match nor be reported as a hit.
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma StackOverflowChecking False
  Dim charArray(-1) As String
  Dim findCharArray(-1) As String
  Dim lastStart As Integer
  Dim i, j As Integer
  Dim matched As Boolean
  If Len(pFind) = 0 Then
    Return pStart
  End If
  charArray = pString.CharacterSplit
  findCharArray = pFind.CharacterSplit
  // Last index at which a complete match can still begin.
  lastStart = UBound(charArray) - UBound(findCharArray)
  For i = Max(pStart, 0) To lastStart
    matched = True
    For j = 0 To UBound(findCharArray)
      If charArray(i + j).Compare(findCharArray(j), pOptions, pLocale) <> 0 Then
        matched = False
        Exit For
      End If
    Next
    If matched Then
      Return i
    End If
  Next
  Return -1
End Function
Public Function CharacterIndexOf(Extends pString As String, pFind As String, Optional pOptions As ComparisonOptions = ComparisonOptions.CaseInsensitive, Optional pLocale As Locale) As Integer
Return pString.CharacterIndexOf(-1, pFind, pOptions, pLocale)
End Function
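As a rough usage sketch, here is the "añb" test from earlier in the thread redone with the extension methods above. It assumes they live in a module in the project and that the iterator treats "ñ" as one character however it is encoded. I have not run this, so treat the comments as expectations rather than verified output.

```xojo
Var s1 As String = "añb"

// Character-based counterparts of Length, Left, IndexOf and Middle.
Var n As Integer = s1.CharacterLength          // expected: 3 characters
Var first As String = s1.CharacterLeft(1)      // expected: "a"
Var i As Integer = s1.CharacterIndexOf("b")    // zero-based character index of "b"
Var s2 As String = s1.CharacterMiddle(i - 1, 1) // the character just before "b"
Break
```

The point of the comparison is that the index returned by CharacterIndexOf and the index consumed by CharacterMiddle are both counted in characters, so they stay in step with each other.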
If Text has a feature that String does not and you’d like to see String support said feature, please submit a feature request.
Thanks for the heated response everyone. Whilst Text would “solve” the problem it’s deprecated and so that’s a no go for me.
I have submitted a feature request for a new method on String (CharacterCount).
In the meantime, I’m using an extension on the String class in a module:
Function CharacterCount(Extends s As String) As Integer
  Var count As Integer = 0
  For Each c As String In s.Characters
    count = count + 1
  Next c
  Return count
End Function
It’s definitely less than ideal as it’s really slow (which is not good when you’re trying to write a code editor). I’m sure it could be made faster if Xojo simply returned the iterator count from String.Characters directly from the framework.
I don’t get why a feature request would be needed. Hasn’t Len always been supposed to return the number of characters, and LenB the number of bytes? Is there something I am missing here?
Sounds to me like a bug report would be the correct thing.
It’s a technical difference since code points and characters are the same in most use cases. Where they are not, the String functions deal with code points, and that’s where the confusion arises.
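To make the distinction concrete, here is a small sketch that builds a string whose code point count and character count should differ. It assumes Encodings.UTF8.Chr accepts a code point value and that the Characters iterator groups a base letter with its combining mark, as discussed in this thread. I have not verified the numbers on every Xojo version.

```xojo
// "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points, one visible character.
Var s As String = "e" + Encodings.UTF8.Chr(&h0301)

// Per the discussion above, the String functions work on code points...
Var codePointCount As Integer = s.Length

// ...while the Characters iterator walks user-perceived characters.
Var characterCount As Integer = 0
For Each c As String In s.Characters
  characterCount = characterCount + 1
Next

// If the thread is right, codePointCount should be 2 and characterCount should be 1.
Break
```

For plain ASCII, or precomposed characters such as a single-code-point "é", the two counts agree, which is why the difference goes unnoticed in most use cases.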
My solution would be to de-deprecate Text for those who need proper text handling without worrying about the underlying storage issues, but that’s not getting much traction.
I see this is not limited to Xojo:
I actually knew that at one point and had forgotten. Thanks for the refresher.
It’s been a long time since I looked at this so I could be wrong but I have a feeling that the only way to determine the count is to iterate the characters.
That’s not to say that a native Xojo implementation couldn’t be faster as implementing it in C would have less overhead.
Add Java, Google Dart, PHP and, I think, Objective-C to that list.