Is this a bug or expected behaviour?

I gave up on fixing it many posts ago, but if people cite me, why should I keep quiet and not respond to them?

Is there any way, using String, to get the answer I would expect (1) for each of Garry’s cases? Or what if I wanted to return the leftmost character of both Strings? Left(1) won’t work; would anything?

The Characters iterator is the only way. Or convert to Text, which handles it differently.

I guess CountFields("") would probably work too, at least for the length.

You can use Text

I use Text in my projects, but I’m curious about solutions that don’t use deprecated features.

I’m just passing along solutions from those who cannot post anymore. Your best bet for a non-deprecated fix is probably going to be a feedback case. Sorry if I’ve distracted.

Thanks Tim, no worries. :slight_smile:

dim s1 as string = "añb"
dim i as integer = s1.indexof("b")
dim s2 as string = s1.middle(i-1,1)
break

Nice

Or the Characters iterator, as mentioned above.

Even funnier

dim s1 as string = "añb"
s1 = s1.ConvertEncoding(Encodings.UTF32)
dim i as integer = s1.indexof("b")
dim s2 as string = s1.middle(i-1,1)
break

Here’s some very rough (barely tested) code that implements grapheme-cluster-friendly string manipulation methods as extensions to the String data type.

Please be aware that they will be slower than the normal string functions because they use the Characters iterator. I believe they could be made faster if Xojo implemented them directly within the framework using the OS / ICU string C functions, but even those require some kind of character iteration, so they would never be as fast as the current Xojo string functions.

Public Function CharacterLength(Extends pString As String) As Integer
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma StackOverflowChecking False
  
  Dim theResult As Integer
  
  theResult = 0
  
  For Each char As String In pString.Characters
    theResult = theResult + 1
  Next
  
  Return theResult
End Function
Public Function CharacterLeft(Extends pString As String, pLength As Integer) As String
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma StackOverflowChecking False
  
  Dim theResult(-1) As String
  
  If pLength > 0 Then
    For Each char As String In pString.Characters
      theResult.Append(char)
      
      pLength = pLength - 1
      
      If pLength = 0 Then
        Exit For
      End If
    Next
  End If
  
  Return Join(theResult, "")
End Function
Public Function CharacterSplit(Extends pString As String) As String()
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma StackOverflowChecking False
  
  Dim theResult(-1) As String
  
  For Each char As String In pString.Characters
    theResult.Append(char)
  Next
  
  Return theResult
End Function
Public Function CharacterRight(Extends pString As String, pLength As Integer) As String
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma StackOverflowChecking False
  
  Dim theResult(-1) As String
  Dim charArray(-1) As String
  Dim count, i As Int32
  
  If pLength > 0 Then
    charArray = pString.CharacterSplit
    
    count = UBound(charArray)
    
    i = Max(count - (pLength - 1), 0)
    
    While i <= count
      theResult.Append(charArray(i))
      
      i = i + 1
    Wend
  End If
  
  Return Join(theResult, "")
End Function
Public Function CharacterMiddle(Extends pString As String, pStart As Integer, Optional pLength As Integer = -1) As String
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma StackOverflowChecking False
  
  Dim theResult(-1) As String
  Dim charArray(-1) As String
  Dim count, i As Int32
  
  If (pLength = -1) Or (pLength > 0) Then
    charArray = pString.CharacterSplit
    
    i = Max(pStart, 0) // Clamp negative starts; bounds checking is disabled above.
    
    If pLength = -1 Then
      count = UBound(charArray)
    Else
      count = Min(i + (pLength - 1), UBound(charArray))
    End If
    
    While i <= count
      theResult.Append(charArray(i))
      
      i = i + 1
    Wend
  End If
  
  Return Join(theResult, "")
End Function
Public Function CharacterIndexOf(Extends pString As String, Optional pStart As Integer = 0, Optional pFind As String, Optional pOptions As ComparisonOptions = ComparisonOptions.CaseInsensitive, Optional pLocale As Locale) As Integer
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma StackOverflowChecking False
  
  Dim charArray(-1) As String
  Dim findCharArray(-1) As String
  Dim lastStart, findUBound, i, j As Integer
  Dim matched As Boolean
  
  // Indexes are zero-based; negative start positions are treated as 0.
  pStart = Max(pStart, 0)
  
  // An empty search string matches at the start position.
  If Len(pFind) = 0 Then
    Return pStart
  End If
  
  charArray = pString.CharacterSplit
  findCharArray = pFind.CharacterSplit
  findUBound = UBound(findCharArray)
  
  // Last index at which a complete match could still begin.
  lastStart = UBound(charArray) - findUBound
  
  For i = pStart To lastStart
    // Compare the search characters against the source, one character at a time.
    matched = True
    For j = 0 To findUBound
      If charArray(i + j).Compare(findCharArray(j), pOptions, pLocale) <> 0 Then
        matched = False
        Exit For
      End If
    Next
    
    If matched Then
      Return i
    End If
  Next
  
  // No complete match was found.
  Return -1
End Function

Public Function CharacterIndexOf(Extends pString As String, pFind As String, Optional pOptions As ComparisonOptions = ComparisonOptions.CaseInsensitive, Optional pLocale As Locale) As Integer
  // Search from the first character (index 0).
  Return pString.CharacterIndexOf(0, pFind, pOptions, pLocale)
End Function

If Text has a feature that String does not and you’d like to see String support said feature, please submit a feature request.

Thanks for the heated response everyone. Whilst Text would “solve” the problem it’s deprecated and so that’s a no go for me.

I have submitted a feature request for a new method on String (CharacterCount).

In the meantime, I’m using an extension on the String class in a module:

Function CharacterCount(Extends s As String) As Integer
  Var count As Integer = 0
  For Each c As String In s.Characters
    count = count + 1
  Next c

  Return count
End Function

It’s definitely less than ideal as it’s really slow (which is not good when you’re trying to write a code editor). I’m sure it could be made faster if Xojo simply returned the iterator count from String.Characters directly from the framework.

I don’t get why a feature request would be needed. Hasn’t Len always been supposed to return the number of characters, and LenB the number of bytes? Is there something I am missing here?

It sounds to me like a bug report would be the correct thing.

It’s a technical difference since code points and characters are the same in most use cases. Where they are not, the String functions deal with code points, and that’s where the confusion arises.

My solution would be to de-deprecate Text for those who need proper text handling without worrying about the underlying storage issues, but that’s not getting much traction.

I see this is not limited to Xojo:

[image]

I actually knew that at one point and had forgotten. Thanks for the refresher.
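
For anyone without Xojo at hand, the same effect can be sketched in Python, whose `len` also counts code points rather than user-perceived characters. The test string is an assumption: it spells ñ as `n` followed by U+0303 (combining tilde), which is the kind of input that appears to be behind the results discussed in this thread.

```python
import unicodedata

# Assumed input: "añb" with a decomposed ñ, i.e. "n" + U+0303 COMBINING TILDE.
decomposed = "an\u0303b"

# len() counts code points, so the combining tilde is counted on its own.
code_points = len(decomposed)

# Indexing by code point can return the combining mark without its base letter.
before_b = decomposed[decomposed.index("b") - 1]

# NFC normalization composes n + U+0303 into the single code point U+00F1.
composed = unicodedata.normalize("NFC", decomposed)

print(code_points)           # 4
print(before_b == "\u0303")  # True
print(len(composed))         # 3
```

Normalizing to NFC only helps when a precomposed code point exists; sequences with no precomposed form still report more code points than visible characters.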

It’s been a long time since I looked at this, so I could be wrong, but I have a feeling that the only way to determine the count is to iterate the characters.

That’s not to say that a native Xojo implementation couldn’t be faster, as implementing it in C would have less overhead.
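
As a rough illustration of why some iteration seems unavoidable, here is a Python sketch that walks the string once and skips combining marks. `approx_character_count` is a hypothetical helper, not part of any library, and it is only an approximation: full grapheme cluster segmentation (Unicode UAX #29) also has to handle cases such as emoji joined with zero-width joiners, which this ignores.

```python
import unicodedata


def approx_character_count(s: str) -> int:
    """Count code points that are not combining marks.

    Hypothetical helper for illustration only; this is a simplification
    of real grapheme cluster segmentation.
    """
    count = 0
    for ch in s:
        # General categories Mn, Mc and Me are combining marks that attach
        # to the preceding base character, so they are not counted.
        if not unicodedata.category(ch).startswith("M"):
            count += 1
    return count


print(approx_character_count("an\u0303b"))  # 3
print(len("an\u0303b"))                     # 4
```

Every code point has to be inspected before the total is known, which is why the count cannot simply be read from the byte length or the code point length.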

Add Java, Google Dart, PHP and, I think, Objective-C to that list.