String.Characters: How to determine the last character?

With the help of the String.Characters iterator it is possible to output every single character correctly, even when a character is made up of several parts (some emojis, Gujarati, etc.) and its String.Length is therefore greater than 1. This is already very helpful, but it is also where the tricky part starts: how can I find out when the iterator has reached its end?

What do I want to do? I need to read the String.Length of the last character of a String. I can add up the lengths of all previous characters within the loop, so that I have the correct offset for String.Middle to read the last character. That’s not the problem; I just need to know when the loop is on its last pass.

That’s how I do it at the moment. Any other suggestions?

Var sentence As String = "Hello World - હેલો વર્લ્ડ 👨🏻‍🦰"
Var index, length As UInteger

length = sentence.Length

For Each character As String In sentence.Characters

  index = index + character.Length

  If index = length Then

    ' Last character.

  End if

Next

Haven’t you seen https://documentation.xojo.com/api/data_types/string.html#string-rightBytes ?

And its relatives (in “See Also”)

Of course, but even for that I need to know the exact length of the last character beforehand. And with dynamic user input, I can never know this beforehand. Therefore I have to iterate through all characters using String.Characters.

My only concern in this thread is whether there is an alternative to my approach above that also works well.

You could use a traditional For… loop to iterate over the characters by index:

Var sentence As String = "Hello World - હેલો વર્લ્ડ 👨🏻‍🦰"
Var length As Integer = sentence.Length - 1
Var currentCharacter As String

For index As Integer = 0 To length
  currentCharacter = sentence.Middle(index, 1)
  If index = length Then
    ' Last character.
  End If
Next

Thank you for your reply, @Anthony_G_Cyphers. Unfortunately I can’t use your approach because your code doesn’t account for composite characters. Try this and look at the logs :wink: This is why String.Characters is so important. It looks like it really is the only possible way, unless Kem comes up with a RegEx now :smiley:

Var sentence As String = "Hello World - હેલો વર્લ્ડ 👨🏻‍🦰"
Var length As Integer = sentence.Length
Var currentCharacter As String

For index As Integer = 0 To length
  currentCharacter = sentence.Middle(index, 1)
  System.DebugLog(currentCharacter)
  If index = length Then
    ' Last character.
  End If
Next

Yeah, that’s not great. Even using sentence.ToArray("") fails in this regard.

I wouldn’t say it “fails” because the String methods are all byte based, which is very good.

Except there are specific methods for getting byte values, like string.MiddleBytes.

This is true, however, changing the Xojo framework from String.Middle/Left/Right to respect composed characters would break code. This was discussed at length in the feedback case for the introduction of String.Characters at the time, so I can live with it even if it amounts to a bit of extra work.

I’m not sure if this works 100% of the time, but this pattern will find a letter followed (maybe) by a non-enclosing mark at the end of the source.

\pL\p{Mn}?\Z

If you are looking for more than just letters, it would have to be adapted.

And there is always this:

var lastChar as string
for each char as string in source.Characters
  lastChar = char
next
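
The same loop can be wrapped in a small helper. This is only a sketch: the function name LastCharacter is made up for illustration, and it relies on the behavior observed earlier in the thread, namely that String.Characters yields each composite character as one String.

```xojo
' Hypothetical helper (name chosen for illustration): returns the last
' user-perceived character of source, or "" if source is empty.
Public Function LastCharacter(source As String) As String
  Var lastChar As String
  For Each char As String In source.Characters
    lastChar = char ' overwritten on every pass, so the final value wins
  Next
  Return lastChar
End Function
```

With that, LastCharacter(sentence).Length gives the length the original question asks for, without tracking an index inside the loop.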

Why do you need to know this?

…because there will be situations where e.g. a text cursor in a custom control in combination with the backspace key deletes the previous character :wink:
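
For the backspace case, one possible approach is sketched below. The function name RemoveLastCharacter is hypothetical, and the sketch assumes that the cursor sits at the end of the text and that String.Left and String.Length count in the same units, which matches the behavior discussed in this thread.

```xojo
' Hypothetical helper (name chosen for illustration): removes the last
' user-perceived character, including all parts of a composite character.
Public Function RemoveLastCharacter(source As String) As String
  Var lastChar As String
  For Each char As String In source.Characters
    lastChar = char ' after the loop this holds the last composite character
  Next
  ' Drop as many units from the end as the last character occupies.
  ' For an empty source this is Left(0), which returns "".
  Return source.Left(source.Length - lastChar.Length)
End Function
```

A cursor in the middle of the text would need the same idea applied to the part of the string before the cursor.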

Surely the String.Length of any character is 1.

No, that’s not true. Try my code from above and you’ll see there will be characters and emojis with a length > 1.

Xojo is now 0-based, so

For index As Integer = 0 To length

should be

For index As Integer = 0 To length-1
If index = length-1 Then

The last char should be
System.DebugLog sentence.Right(1)
You could compare against it in the For Each loop to know that you are at the end.

Also interesting

Nope, as I wrote above (you’ll get the last of the composite characters of the Emoji):

On closer inspection I agree :slight_smile:

Public Function StringToCompoundArray(s As String) As String()
  Var result() As String
  For Each char As String In s.Characters
    result.Add(char)
  Next
  Return result
End Function

// Test function

Var sentence As String = "End હેલો વર્લ્ડ 👨🏻‍🦰"

Var compound() As String = StringToCompoundArray(sentence)

Var msg As String

For i As Integer = 0 to compound.LastIndex
  msg = msg + "["+compound(i)+"] length:"+compound(i).Length.ToString+EndOfLine
Next

// compound.LastIndex is where the last compound char is
// Now you have access to any of its parts and lengths

MessageBox msg

If we copy the emoji into source code and select the last char, it is
temporarily displayed as 2 characters, but its length is 4.

Const emo As String = "👨🏻‍🦰"

System.DebugLog emo.Length.ToString '<- returns 4

Var b As Boolean = emo.Right(4) = emo '<- returns True

System.DebugLog b.ToString

Yeah, that’s because Xojo’s Code Editor doesn’t handle composite characters. You can see it if you try to select the emoji with the keyboard.