IndexOf with UTF-8

If I perform a search with the IndexOf function (case-sensitive comparison) and the text contains characters such as üèö…, I get an incorrect value back. I understand that the problem is with the UTF-8 encoding, but I don’t know how to fix it.

With a case-insensitive comparison it works fine.

Please post your code, the text you are searching, what you are searching for, what you expect, and what you got.


The code

Var MyText As String = "Test öö IndexOf "
Var Position As Integer = MyText.IndexOf("Index", ComparisonOptions.CaseInsensitive)
' Returns 8

Var MyText As String = "Test öö IndexOf "
Var Position As Integer = MyText.IndexOf("Index", ComparisonOptions.CaseSensitive)
' Returns 10

Bug for sure - report it. I experimented with converting it to UTF-32 and UTF-16 and it completely fails:

Var MyText As String = "Test öö IndexOf "
MyText = MyText.ConvertEncoding(Encodings.UTF16) // same idea with Encodings.UTF32

Var Position As Integer = MyText.IndexOf("Index", ComparisonOptions.CaseSensitive)
// Position = -1



Var MyText As String = "Test öö IndexOf "
Var Position As Integer = MyText.IndexOf("Index", ComparisonOptions.CaseSensitive, Locale.Raw)
' Returns 10

It should return 8 too. Info from Issue #65969 (note from Paul).


Confirmed here.

Works great for UTF-8 but fails (Position = -1) with UTF-16 and throws a RuntimeException with UTF-32.

That’s with 2020 r1.2.

This is the kind of bug that gives the language a bad reputation for being riddled with framework bugs. This is totally a unit-testable bug.

It’s returning the byte index instead of the character index. You see that if you search for “ö” instead. (It returns 5.)
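If that diagnosis is right, the numbers add up: in UTF-8 "Test " is 5 bytes, each ö is 2 bytes, and the space after them is 1 byte, so "Index" starts at byte 10 but at character 8. Until a fix ships, one possible workaround is to ask for the byte offset explicitly and convert it to a character offset yourself. This is an untested sketch. It assumes String.IndexOfBytes returns a zero-based byte offset (-1 when not found), that LeftBytes keeps the string's encoding so Length counts characters, and that both strings are UTF-8. BytePos and CharPos are just names I made up.

```xojo
Var MyText As String = "Test öö IndexOf "

// Byte-wise search, so it is case-sensitive by nature.
// Assumption: zero-based byte offset, -1 when not found.
Var BytePos As Integer = MyText.IndexOfBytes("Index")

Var CharPos As Integer = -1
If BytePos >= 0 Then
  // Take everything before the match by bytes, then count its characters.
  // Assumption: LeftBytes keeps the UTF-8 encoding, so Length is a character count.
  CharPos = MyText.LeftBytes(BytePos).Length
End If
```

Because this never relies on what the case-sensitive IndexOf returns, it should give the same result before and after the bug is fixed.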

Sign onto the Issue here:

I think this is a similar issue: #70437
Same kind of problem: #69674, which has been fixed.

Yeah - seems like there are plenty of issues with this method. Hopefully this will get fixed sooner rather than later.

Targeted to 2023r4 (Milestone).