Accented characters in FolderItem names on Catalina

Hi,
I’m on Mojave and don’t have the issue myself, but some users on Catalina have reported a problem that seems to be related to accented characters.
I’ve filed bug report #60309, but as there has been no answer, I would like to know whether I’m the only one having this problem with my applications.
I sync 2 folders, FolderA and FolderB.
(My method is more complicated, but I have simplified it here to explain the problem.)
First I loop through the two folders and build two arrays holding the names of the items in each folder, ArrayFoldA and ArrayFoldB.
Then I loop over each item of ArrayFoldA and call ArrayFoldB.IndexOf(ArrayFoldA(ThisIndex)).
If the result is -1, the item is supposed to be absent from FolderB, so I copy it. Before copying I do a check, and the check fails: the item does exist, as FolderB.Child(ArrayFoldA(ThisIndex)).Exists returns True.

It happens with accented characters, as in “Clia Mara”. It seems to depend on how the accented character was entered: with the single “é” key (stored as one character), or with the accent key followed by the “e” key to obtain “é” (stored as two characters).

a) The Feedback case is missing an example.

b) You should have been on Catalina for 11 months by now.

c) Your explanation here is kind of rambling and the Feedback case is worse. You have a folderitem with the name “Clia Mara”. An .exists for the folderitem fails. Is this correct?

d) Your problem likely comes from improper normalization of your strings. Which you say yourself. So why don’t you just normalize your strings?

Oops. You are correct. The normalization is different.

Create a folder with the name “smörebröd” on the desktop. Run the following code:

[code]dim f as FolderItem = SpecialFolder.Desktop

for currentitem as Integer = 0 to f.Count - 1
  dim d as FolderItem = f.ChildAt(currentitem)
  msgbox d.NativePath
  if d.Name = "smörebröd" then msgbox "found"
next[/code]

There will be no messagebox with “found”.

Thank you for the answer.
I didn’t create an example because I can’t verify that it shows the problem, as I’m not on Catalina.

I have read a lot of things about Catalina and I prefer to stay on Mojave.

Yes, the folder named “Clia Mara” (or “smörebröd”) exists in both folders A and B. If I look for it with IndexOf in an array of the names, it is not found. But if I check it with .Exists, it exists.
So .Exists works and .IndexOf doesn’t.
And it seems, reading your example, that writing my own IndexOf method that loops over each item of the array won’t work either, since comparing the two strings (FolderItem.Name and the string itself) returns False too.

With your example, if I have:
ArrayFoldB.AddRow(f.ChildAt(0).Name) ' f is a folder containing only one FolderItem, a folder named "smörebröd"
then ArrayFoldB.IndexOf("smörebröd") returns -1 and ArrayFoldB(0) = "smörebröd" returns False.

To answer your last question: I don’t normalize strings (and I don’t know what normalizing a string means).

Thank you very much for completing my post and my Feedback case. It’s true I wasn’t clear, but I’m French and my English is not good enough. Reporting a bug clearly isn’t easy even in one’s own language.

I’m going crazy. I created a folder “Without accent” next to your folder “smörebröd” and modified your small program:

dim f as FolderItem = SpecialFolder.Desktop
Dim MyArray(-1) as String

for currentitem as Integer = 0 to f.Count - 1
  dim d as FolderItem = f.ChildAt(currentitem)
  msgbox d.NativePath
  MyArray.AddRow(d.Name)
  if d.Name = "smörebröd" then msgbox "found 'smörebröd'"
  if d.Name = "Without accent" then msgbox "found 'Without accent'"
next

MessageBox "IndexOf 'smörebröd' = " + str(MyArray.IndexOf("smörebröd")) + EndOfLine + "IndexOf 'Without accent' = " + str(MyArray.IndexOf("Without accent"))

I don’t get the MessageBox for “smörebröd”, but I do get the one for “Without accent”.
IndexOf("smörebröd") returns -1 and IndexOf("Without accent") returns 3.

So I have the problem on Mojave too!!! Just not in my program!?

I synced my Desktop folder with another folder using my sync program and everything works fine. I suppose Mojave does something with the string that Catalina doesn’t (or the opposite).

Maybe a hint: I had some similar problems a while ago, and they were due to differing Unicode representations.
The same accented letter « ö » can be stored as two different Unicode sequences even though both are displayed identically.

I don’t have the details anymore…
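
To make the hint above concrete, here is a minimal sketch in Python (not Xojo, chosen only because it is easy to run anywhere) of the two representations of « ö ». It relies only on the standard `unicodedata` module; the variable names are just for illustration.

```python
import unicodedata

composed = "\u00f6"      # ö as a single code point (precomposed form)
decomposed = "o\u0308"   # o followed by a combining diaeresis

# Both display as ö, but a plain comparison sees different code points.
print(composed == decomposed)
print(len(composed), len(decomposed))

# Normalizing both sides to the same form makes them comparable again.
nfc = unicodedata.normalize("NFC", decomposed)   # compose
nfd = unicodedata.normalize("NFD", composed)     # decompose
print(nfc == composed, nfd == decomposed)
```

Which form a given file system or API hands back is a separate question; the sketch only shows why two identical-looking names can compare as different.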

I changed the program and ran some tests. I know Encodings.MacRoman may lose some characters, but I don’t know about UTFxxLE/BE, as I don’t have a problem with those encodings.

dim f as FolderItem = SpecialFolder.Desktop
Dim TheText, TpTextA, TpTextB as String
Dim MyArray(-1) as String
Dim MyEncod as TextEncoding

If false Then ' These encodings work
  MyEncod = Encodings.UTF32LE ' Encodings.UTF32BE ' Encodings.UTF16BE ' Encodings.UTF16LE ' Encodings.MacRoman
Else ' These don't
  MyEncod = Encodings.UTF32 ' Encodings.UTF8 ' Encodings.UTF16
End If
TpTextA = "smörebröd"
TpTextB = "Without accent"
TpTextA = TpTextA.ConvertEncoding(MyEncod)
TpTextB = TpTextB.ConvertEncoding(MyEncod)
for currentitem as Integer = 0 to f.Count - 1
  dim d as FolderItem = f.ChildAt(currentitem)
  TheText = d.Name.ConvertEncoding(MyEncod)
  MyArray.AddRow(TheText)
  if TheText = TpTextA then
    MessageBox d.NativePath + EndOfLine + "found '" + TpTextA + "'"
  Elseif TheText = TpTextB then
    MessageBox d.NativePath + EndOfLine + "found '" + TpTextB + "'"
  else
    MessageBox d.NativePath + EndOfLine + "nothing found"
  end if
next

MessageBox "IndexOf '" + TpTextA + "' = " + str(MyArray.IndexOf(TpTextA)) + EndOfLine + "IndexOf '" + TpTextB + "' = " + str(MyArray.IndexOf(TpTextB))

Two things that might be involved when it comes to the macOS file system and encodings:

  • HFS vs. APFS
  • Xojo version up to 2019r1.1 vs. 2019r2+

You should try every combination to check when and why things behave differently.

APFS is a pain with filenames containing umlauts. The behavior might depend on the API being used, which changed completely with 2019r2 on the macOS build target.

Have you tried using string.compare to compare the name?
https://documentation.xojo.com/api/data_types/string.html#string-compare

I modified the test code.
TestEncodFolderItemName

Indeed, .StringCompare does work: it returns 0 where TheText = TpTextA returns False.
So a workaround could be to write my own .MyIndexOf using StringCompare, but it will take much longer :frowning: .
My synchronization software is recursive and handles big folders, so many, many files.

Note: I noticed that if I convert to Encodings.UTF32, StringCompare no longer works.
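
On the speed concern above: one alternative to comparing every pair of names with a tolerant comparison is to normalize each name once when the arrays are filled, after which an ordinary exact lookup works. A minimal sketch of that idea in Python (not Xojo); `normalized` and `names_missing_from` are hypothetical helper names, and the two folder listings are made-up sample data.

```python
import unicodedata

def normalized(name: str) -> str:
    # Bring every name to one canonical form (NFC here) before comparing.
    return unicodedata.normalize("NFC", name)

def names_missing_from(folder_a_names, folder_b_names):
    # Normalize folder B once and keep the result in a set for fast lookups.
    present = {normalized(n) for n in folder_b_names}
    return [n for n in folder_a_names if normalized(n) not in present]

# Sample data: the same accented name, composed in A and decomposed in B.
folder_a = ["sm\u00f6rebr\u00f6d", "Without accent", "only in A"]
folder_b = ["smo\u0308rebro\u0308d", "Without accent"]

missing = names_missing_from(folder_a, folder_b)
print(missing)
```

The cost is one normalization per name instead of one tolerant comparison per pair, which matters for large recursive syncs.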

Can you not just do something similar to below? I just typed it into the forum, so I haven’t tested it.

[code]Dim folderBItem As FolderItem
Dim filesToCopy( -1 ) As String

' FolderItem.Item is 1-based, so loop from 1 to Count
Dim n As Integer = folderA.Count
For l As Integer = 1 To n
  folderBItem = folderB.Child( folderA.Item( l ).Name )
  If folderBItem.Exists = False Then filesToCopy.Append folderA.Item( l ).Name
Next

If UBound( filesToCopy ) > -1 Then
  MsgBox "The following files need updating" + EndOfLine + EndOfLine + Join( filesToCopy, EndOfLine )
End If
[/code]

Hi Thomas,

unless your Mac does not support Catalina, you should at least install it in a virtual machine, to be able to run tests.

It seems Mac users have a very high adoption rate for any new system. They will be expecting your program to support it.

Have a look at the thread “String comparison works on Windows but not on OSX”.
There is a declare to normalize strings. I assume it will help to solve the issue.

Sam, yes, your method should work, as there is no encoding problem with .Child(MyString).
Michel, my MacBook is old and I have read too many bad things about Catalina. I’m a hobbyist and don’t need many people to donate for my applications. And I think we shouldn’t have to chase every Apple evolution. I hate Apple; their evolutions bring more problems than benefits for users. Just to be a registered Apple developer I would have to pay $100/year! Just to be listed by Apple! Whereas I pay $300/year to develop with Xojo, use this forum, get support, and have a wonderful development tool. Wow, that’s something I don’t agree with. But that’s another subject.
Thomas, I did look at the linked forum thread.

I changed my program and now fill my array with:
MyArray.AddRow(MyFolder.ChildAt(ThisIndex).Name.ConvertEncoding(Encodings.UTF32LE))
and I just received 2 emails from the 2 people who reported the bug, saying that everything is OK now.

But I think MyStringA = MyStringB should return True if the only difference is how the accent is encoded, a single “é” versus an “e” followed by a separate accent.
And IndexOf should work too.
If we want to distinguish the two, we should compare bytes, with .IndexOfBytes or something like that.
I think this will give us headaches, not only when reading file names but also when someone fills a TextField by pasting a string, since the encoding of its accented characters may differ. I will do some tests with TextFields.

Unfortunately it doesn’t. Therefore you could use a normalize call to solve this.

I have 2 TextFields:

Dim MyTextA, MyTextB as String
MyTextFieldA.Value = "smörébröd"
MyTextFieldB.Value = f.ChildAt(1).name ' This file is a folder named "smörébröd"
MyTextA = MyTextFieldA.Value ' .ConvertEncoding(MyEncod)
MyTextB = MyTextFieldB.Value ' .ConvertEncoding(MyEncod)
MessageBox str(MyTextA.Bytes) + " - " + str(MyTextB.Bytes) + endofline + Cstr(MyTextA = MyTextB)

I obtain “12 - 15” and False.
If in MyTextFieldB I delete the first ö and type it again, then MyTextB.Bytes = 14; if I do the same for the second ö, it is 13; and if I also replace the é, then MyTextB.Bytes = 12, and in this case (MyTextA = MyTextB) returns True.

If I do

MyTextA = MyTextFieldA.Value.ConvertEncoding(MyEncod)
MyTextB = MyTextFieldB.Value.ConvertEncoding(MyEncod)

Then str(MyTextA.Bytes) + " - " + str(MyTextB.Bytes) returns “36 - 48”.
But (MyTextA = MyTextB) returns True.
If I replace the two ö and the é, then MyTextB.Bytes = 36 and, of course, (MyTextA = MyTextB) is still True.
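
The byte counts above are what you would expect if the typed string is in composed form and the name read from disk is in decomposed form. That is an assumption on my part, not something verified against the file system, but the arithmetic can be checked with a short Python sketch (not Xojo) that counts the bytes of both forms in UTF-8 and in UTF-32 little-endian:

```python
import unicodedata

# Assumption: the typed text is composed (NFC) and the file name is decomposed (NFD).
typed = unicodedata.normalize("NFC", "smörébröd")
from_disk = unicodedata.normalize("NFD", "smörébröd")

# UTF-8: each precomposed accented letter takes 2 bytes,
# each base letter plus combining mark takes 1 + 2 bytes.
utf8_sizes = (len(typed.encode("utf-8")), len(from_disk.encode("utf-8")))

# UTF-32 LE: 4 bytes per code point, 9 code points vs 12.
utf32_sizes = (len(typed.encode("utf-32-le")), len(from_disk.encode("utf-32-le")))

print(utf8_sizes)
print(utf32_sizes)
```

Each accented letter that is retyped goes from 3 bytes to 2 in UTF-8, which matches the 15, 14, 13, 12 sequence reported above.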

Note: I just saw that .IndexOfBytes already exists in Xojo.

Look at this example. Put the code in a Textarea open event:

[code]Soft Declare Function CFStringCreateMutableCopy Lib “Foundation” (alloc As Ptr, maxLength As UInt32, theString As CFStringRef) As CFStringRef

Dim compositeMB As New MemoryBlock(5)
compositeMB.Byte(0) = &h62
compositeMB.Byte(1) = &h6c
compositeMB.Byte(2) = &hc3
compositeMB.Byte(3) = &hb6
compositeMB.Byte(4) = &h64

Dim decompositeMB As New MemoryBlock(6)
decompositeMB.Byte(0) = &h62
decompositeMB.Byte(1) = &h6c
decompositeMB.Byte(2) = &h6f
decompositeMB.Byte(3) = &hcc
decompositeMB.Byte(4) = &h88
decompositeMB.Byte(5) = &h64

Dim cS As String = compositeMB.StringValue(0, 5)
Dim dS As String = decompositeMB.StringValue(0, 6)
cS = cS.DefineEncoding(Encodings.UTF8)
dS = dS.DefineEncoding(Encodings.UTF8)
Me.AddText "composite: " + cS + EndOfLine
Me.AddText "decomposite: " + dS + EndOfLine

If cS = dS Then
  Me.AddText "EQUALS" + EndOfLine
Else
  Me.AddText "DIFFERS" + EndOfLine
End If

Dim mutableStringRef As CFStringRef = CFStringCreateMutableCopy(Nil, 0, dS)

Soft Declare Sub CFStringNormalize Lib "Foundation" (theString As CFStringRef, theForm As UInt32)

CFStringNormalize mutableStringRef, 2 ' 2 = kCFStringNormalizationFormC (composed form)
dS = mutableStringRef
Me.AddText "After normalization" + EndOfLine

If cS = dS Then
  Me.AddText "EQUALS" + EndOfLine
Else
  Me.AddText "DIFFERS" + EndOfLine
End If
[/code]

:smiley:

Thomas #2 (as I am #1 :slight_smile: ), I wrote a method from your example:

[code]Public Function NormStrgEncod(CeTexte as String, CeForm as UInt32 = 2) as String
  
  #If TargetMacOS Then
    If Not (CeTexte.Encoding = Nil) Then
      Soft Declare Function CFStringCreateMutableCopy Lib "Foundation" (alloc as Ptr, maxLength as UInt32, TheString as CFStringRef) as CFStringRef
      Soft Declare Sub CFStringNormalize Lib "Foundation" (TheString as CFStringRef, TheForm as UInt32)
      
      Dim mutableStringRef as CFStringRef = CFStringCreateMutableCopy(Nil, 0, CeTexte) ' Not needed: CeTexte.ConvertEncoding(Encodings.UTF8)
      
      CFStringNormalize(mutableStringRef, CeForm)
      ' CFStringNormalize mutableStringRef, 2 ' In one example it was 2; in the other, CeForm was passed as a parameter
      CeTexte = mutableStringRef
      
    End If
  #EndIf
  
  Return CeTexte
  
End Function[/code]
I wanted to use Extends, but Xojo tells me I can’t with Declare. Not a problem.
So whenever I read a FolderItem name, I do:

MyName = NormStrgEncod(f.ChildAt(MyIndex).Name)

And it works, thank you. As I don’t understand everything I wrote in the method, could you tell me whether it’s OK like that?
I tried passing CeTexte after doing CeTexte = CeTexte.ConvertEncoding(Encodings.UTF32LE), but your method doesn’t care, even if I don’t convert it back to UTF8. So it isn’t necessary to do that in your method?

My previous workaround was ConvertEncoding(Encodings.UTF32LE), but someone reported that it causes an error when copying files!!!???
I don’t understand, as I do:
MyArray.AddRow(f.ChildAt(MyIndex).Name.ConvertEncoding(Encodings.UTF32LE))
but I don’t do anything to the FolderItem name itself. It’s difficult with people who report bugs; sometimes the problem is somewhere else.

YOU ARE Thomas #1, as your method works; I AM #2 :smiley: .