APFS and Xojo FolderItem issues - may be a very serious issue!

@ Joe

Did you try using FolderItem ? That’s what isn’t working.

Here’s my best bet:
'APFS can’t be used on startup disks or Apple’s Fusion Drives, and filenames are case-sensitive only.'

The IDE uses it quite a bit.

As someone who has worked on the Xojo (Realbasic) source for FolderItem a while ago (whew, that’s already 9 years back!) and also having had related issues in the past with my tool Find Any File, I suspect this is an old “bug” in the FolderItem code, and it might be that I even mentioned this to JoeR at one time in the past.

First, some general background on this issue:

This revolves around Unicode normalization. If you search the Web for current discussions on this related to iOS 10.3 and APFS, you’ll see a lot of talk about this being a potential issue.

That problem is about the fact that some Unicode characters such as “ü” can be encoded in two ways: either as a single “ü” code point (U+00FC), or as a sequence that combines “u” (U+0075) with the combining “¨” (U+0308). Both variants are valid; the first is called a composed (or precomposed) character, the latter a decomposed one.

Now, HFS+ converts all names it receives into the decomposed format before looking up or creating a file name on disk, so it usually does not matter how you form the names in your program.

APFS (and this is for the case-sensitive format that’s the only one available until the recent 10.12.4 added a case-insensitive version) does not do this conversion. It lets you create two files both named “ü” on your APFS volume, and they both look the same to you. You can’t tell the difference unless you look at the name with a hex viewer to see which Unicode format was used.

It also means that if you have a single file named “ü”, using the composed format, and you try to access the file by passing the same name in decomposed format, you’ll get a “file not found”.
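To make the difference visible, here is a minimal sketch that builds “ü” in both forms and dumps the bytes. It is only an illustration: the variable names are made up, and it assumes UTF-8 strings built with `Encodings.UTF8.Chr` and inspected with `EncodeHex`.

```xojo
// Build "ü" in both Unicode forms (illustrative variable names).
Dim composed As String = Encodings.UTF8.Chr(&h00FC)          // U+00FC, precomposed
Dim decomposed As String = "u" + Encodings.UTF8.Chr(&h0308)  // U+0075 + U+0308 (combining diaeresis)

// Both display as "ü", but the UTF-8 bytes differ:
// the composed form is C3 BC, the decomposed form is 75 CC 88.
System.DebugLog("composed:   " + EncodeHex(composed))
System.DebugLog("decomposed: " + EncodeHex(decomposed))

// A byte-wise comparison therefore treats them as different names.
If StrComp(composed, decomposed, 0) <> 0 Then
  System.DebugLog("The two forms differ byte-wise")
End If
```

A file system that stores names as given, without normalizing, sees these as two unrelated names.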

How this may relate to Xojo and FolderItem

First, I’ve run into this problem myself with Xojo, even on HFS volumes, when I search for files matching user-entered names:

As “Find Any File” (FAF) lets you enter a file name to search for, it may get the name in composed format. When my code then reads the directory, getting all the names, these come from the HFS driver, which will always give me decomposed names as explained above. Now, when I want to check whether the user-entered name matches the name on disk that I got from FolderItem.Name, StrComp as well as “=” may not match them up even though they should. That’s because the string comparison functions Xojo uses are not smart enough to consider composition differences.

I’ve made a small test program for you all to try:

[code]// See http://unicode.org/reports/tr15/

// Construct three representations of the letter Å (angstrom)
dim mb1 as new MemoryBlock (8)
mb1.UInt16Value(0) = &h212B
dim mb2 as new MemoryBlock (8)
mb2.UInt16Value(0) = &h0041
mb2.UInt16Value(2) = &h030A
dim mb3 as new MemoryBlock (8)
mb3.UInt16Value(0) = &h00C5
dim s1 as String = mb1.StringValue(0,2).DefineEncoding(Encodings.UTF16)
dim s2 as String = mb2.StringValue(0,4).DefineEncoding(Encodings.UTF16)
dim s3 as String = mb3.StringValue(0,2).DefineEncoding(Encodings.UTF16)

// Now let’s see how Xojo compares them using various comparison methods

dim eq1_2 as Boolean = s1 = s2
dim eq1_3 as Boolean = s1 = s3
dim eq2_3 as Boolean = s2 = s3

dim cmp1_2 as Integer = StrComp (s1, s2, 0)
dim cmp1_3 as Integer = StrComp (s1, s3, 0)
dim cmp2_3 as Integer = StrComp (s2, s3, 0)

dim icmp1_2 as Integer = StrComp (s1, s2, 1)
dim icmp1_3 as Integer = StrComp (s1, s3, 1)
dim icmp2_3 as Integer = StrComp (s2, s3, 1)

break[/code]

Ideally, at least StrComp(…, 1) should see that they’re all equal, because that’s how this function is supposed to work - take out all those fine language differences and then match them up. But it turns out that NONE of these comparison methods see these chars are equal.

And, sadly, Xojo does not appear to provide a built-in function to fix this (i.e. there’s no way to “normalize” the strings into a common decomposition format).

But with this (Mac specific) declare you can solve this:

declare function decomposedStringWithCanonicalMapping lib "Cocoa" _
  selector "decomposedStringWithCanonicalMapping" (s as CFStringRef) as CFStringRef
s1 = decomposedStringWithCanonicalMapping(s1)
...

Note that this will only get strings in some unified form, but it’s not exactly the form that HFS+ uses! (see http://stackoverflow.com/a/14227711/43615 )
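Wrapped into a helper, the declare could be used roughly like this. This is a sketch under assumptions: the function name `SameNameNormalized` is made up for illustration, the normalization only happens on macOS, and, as noted, the result is canonical decomposition rather than the exact form HFS+ stores.

```xojo
// Hypothetical helper: True if a and b are the same name after canonical decomposition.
Function SameNameNormalized(a As String, b As String) As Boolean
  #If TargetMacOS Then
    Declare Function decomposedStringWithCanonicalMapping Lib "Cocoa" _
      Selector "decomposedStringWithCanonicalMapping" (s As CFStringRef) As CFStringRef

    // Bring both names into the same (decomposed) form first...
    Dim na As String = decomposedStringWithCanonicalMapping(a)
    Dim nb As String = decomposedStringWithCanonicalMapping(b)

    // ...then a plain byte-wise comparison is meaningful.
    Return StrComp(na, nb, 0) = 0
  #Else
    // No normalization available in this sketch on other platforms.
    Return StrComp(a, b, 0) = 0
  #EndIf
End Function
```

With the three strings from the test program, `SameNameNormalized(s1, s2)` and `SameNameNormalized(s2, s3)` should both report a match on macOS, since all three decompose to the same sequence.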

The same problem may occur on Windows and Linux as well, BTW.

How APFS affects this

What Christoph reports, though, is even more problematic: He says that FolderItem gets confused itself, returning nil while fetching the original folder contents. I have not tested this myself, but I wonder if this is what’s going on:

From my own work in the past on the FolderItem code I believe it used to mix different Apple APIs, older and newer ones. I also know that JoeR worked on that code in the years after my work on it, cleaning it up, avoiding the older APIs and modernizing it all.

Now, I wonder if there’s still some code that uses a different API, and that other API converts the names into a format that the others do not use. While this would not affect HFS+ volumes, where the file system code takes care of any such ambiguities, it would lead to the problem I explained at the top if case-sensitive APFS were used. This would explain the behavior that Christoph sees: for instance, FolderItem.Name could return a name converted to decomposed format whereas the name on disk uses the composed form. Now, if you’d use the Child function to look up the file with the name that you just got from FolderItem.Name, you’d get nil back.
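One way to test this suspicion would be a round-trip check: list a folder, then look every entry up again by the name FolderItem just reported. The following is only a sketch; `CheckRoundTrip` is a made-up name, and it uses the classic `Count`, `Item` and `Child` calls. I have not run it on an APFS volume.

```xojo
// Sketch: for every entry in a folder, look it up again by the name FolderItem reports.
// CheckRoundTrip is a hypothetical name; pass any existing folder.
Sub CheckRoundTrip(folder As FolderItem)
  If folder Is Nil Then Return
  If Not folder.Directory Then Return

  For i As Integer = 1 To folder.Count
    Dim item As FolderItem = folder.Item(i)
    If item Is Nil Then
      System.DebugLog("Item(" + Str(i) + ") returned Nil")
      Continue
    End If

    // Ask for the same entry again, using the name we were just given.
    Dim again As FolderItem = folder.Child(item.Name)
    If again Is Nil Then
      System.DebugLog("Child returned Nil for name bytes " + EncodeHex(item.Name))
    ElseIf Not again.Exists Then
      System.DebugLog("Child not found for name bytes " + EncodeHex(item.Name))
    End If
  Next
End Sub
```

If the hypothesis is right, a folder on case-sensitive APFS that contains a file with a composed “ü” in its name should produce log lines, while the same folder on HFS+ should not. The hex dump shows which form the reported name is in.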

In fact, I just replied to another, now related, question about “:” vs. “/” in file names on the Mac: https://forum.xojo.com/39668-getfolderitem-volume-i-name-error-replaced-by

There, I explained that FolderItem.Name returns “/” in names because it’s still using the older file system APIs that use the old naming rules (where “:” is a path delimiter and thus not allowed in file names). Now, what if those older APIs do perform their own form conversion and that leads to this issue?

So, I agree that someone at Xojo needs to take this seriously and at least look into this issue more closely.

Claiming that APFS is still in beta and may change its behavior is NOT a valid argument here because:

  1. Case-sensitive APFS is now officially released (in iOS 10.3) and therefore very likely not to gain any more “fixes” in that area.
  2. Apple has clearly documented that behavior with the normalization issues that several iOS programmers ran into since. Apple has even updated their docs on this topic just a few days ago to clarify that they’re leaving it that way, on purpose: https://developer.apple.com/library/content/documentation/FileManagement/Conceptual/APFS_Guide/FAQ/FAQ.html

Also note that the case-insensitive APFS is probably not going to have these issues as it’ll be “normalization-insensitive”, like HFS+ is now. But for case-sensitive APFS, Xojo may need to get some fixing. Also note that this is different from case-sensitive HFS+, which does normalize names.

Oh, while I got excited writing that last answer, I totally forgot to consider this possibility:

It could be that Christoph is doing it all wrong himself. Since he has not provided any test code as far as I can tell, it might be that his code does mess up the names accidentally, changing the composition form between reading a file name and later looking it up again, thereby not finding the original name any more.

In that case, the FolderItem code would be innocent, but other Xojo functions that he uses and that change the composition form might be to blame.

Christoph, have you considered this?

StrComp in binary mode will of course NOT consider encoding - it compares bytes
StrComp in lexical mode has been documented for some time as not considering encoding (since at least REALbasic version 5.5, as it’s in the printed manuals I have)
Hence you should use ConvertEncoding beforehand to make sure two strings ARE the same encoding
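A minimal sketch of that advice, with a made-up helper name (`SameAfterConvert`). It only aligns the encodings of the two strings; as far as I know, ConvertEncoding does not change whether a character is stored in composed or decomposed form.

```xojo
// Hypothetical helper: byte-compare two strings after converting both to UTF-8.
Function SameAfterConvert(a As String, b As String) As Boolean
  Dim ua As String = ConvertEncoding(a, Encodings.UTF8)
  Dim ub As String = ConvertEncoding(b, Encodings.UTF8)
  Return StrComp(ua, ub, 0) = 0
End Function
```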

TEXT comparisons are encoding sensitive

[quote=324032:@Thomas Tempelmann]Ideally, at least StrComp(…, 1) should see that they’re all equal, because that’s how this function is supposed to work
[/quote]
No, it’s not
It’s been documented as ignoring encodings for quite some time
Since at least REALbasic version 5.5, as it’s in the printed manuals I have on hand

Norman, do you really not understand the difference between encodings and normalization? Despite me just having written a very extensive explanation about it, right here? If you’d just look at the code examples I gave, you’d realize that they DO use the SAME encoding.

I wasn’t commenting on normalization
Just your claim that StrComp is “not smart enough to …” and that it is supposed to consider encodings
It’s not designed to consider encodings and hasn’t been for a long time and has been documented as ignoring encodings for > 10 years
You’re complaining that StrComp doesn’t consider encodings when we already tell you it doesn’t

And there are a number of things that StrComp does not do well (that’s pretty well known)
Handling different normalized forms is definitely one

Hence “text”

		dim t1, t2, t3 as text
		t1 = &u212b
		t2 = &u0041 + &u030a
		t3 = &u00c5
		
		dim teq1_2 as Boolean = t1 = t2
		dim teq1_3 as Boolean = t1 = t3
		dim teq2_3 as Boolean = t2 = t3

and this gives you the correct results t1 = t2 , t1 = t3 and t2 = t3
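For names that arrive as String (FolderItem.Name, user input), the same idea could be wrapped like this. A sketch only: `SameNameAsText` is a made-up name, it assumes both strings have a defined encoding so that `ToText` succeeds, and you should check whether the case handling of “=” on Text is what you want on a case-sensitive volume.

```xojo
// Hypothetical helper: compare two String names using Text semantics.
Function SameNameAsText(a As String, b As String) As Boolean
  // ToText needs a String with a defined encoding.
  Dim ta As Text = a.ToText
  Dim tb As Text = b.ToText
  Return ta = tb
End Function
```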

I only briefly did some simple tests with FolderItem and it always returned nil. I did not look into this in more detail - I was hoping Xojo Inc would do some testing.
The odd bit is that Joe wrote he has no issues running the Xojo IDE on an APFS formatted volume. Here, all the apps I knew were made with Xojo did not work anymore. Didn’t test FAF though :)

The customer who reported my apps are not running anymore, reverted back to HFS+ and it all works again.
IMO there is definitely something not working right.
In the meantime I also got rid of the APFS volume so cannot do more tests. But if needed, I can format it again.

I made helper functions long ago for users of MBS Plugin to convert between normalization forms.

http://www.monkeybreadsoftware.net/string-stringcompose-method.shtml

Seems to me the proper way to go is to file a bug report with project attached that demonstrates the issue.

And tell customer to not use something that is not even in BETA yet

https://developer.apple.com/library/content/documentation/FileManagement/Conceptual/APFS_Guide/Introduction/Introduction.html
“A Developer Preview of Apple File System is available in macOS Sierra.” means “there’s possibly lots of stuff that doesn’t work yet”

Or even just messing up case sensitivity.

[quote=324058:@Christoph De Vocht]The customer who reported my apps are not running anymore, reverted back to HFS+ and it all works again.
IMO there is definitely something not working right.
In the meantime I also got rid of the APFS volume so cannot do more tests. But if needed, I can format it again.[/quote]

A reproducible bug report would be great.

There are definitely a few things that need to be looked into before APFS ships.

Unicode is tricky - and even if your code works on macOS it may not work on Win32. Here’s an example gotcha that I found (on Win32) in which folderItem.URLpath works in one situation but not another: https://forum.xojo.com/37623-file-urls-with-unicode-using-folderitem-urlpath

This is exactly what Walt Mossberg said when the iPad 1.0 shipped, minus the cute drawings bit… But basically Apple have a long way to go if they want to replace a 'puter with an iPad. 7 years later and most of what he asked for still doesn’t exist.

[quote=323122:@Jeff Tullin]Here’s my best bet:
'APFS can’t be used on startup disks or Apple’s Fusion Drives, and filenames are case-sensitive only.'[/quote]
This is what I’ve been reading as to why other developers are having trouble with it.

That’s the best advice at the moment. Some other developers have reported that macOS itself doesn’t run correctly on APFS; personally I’ve not tested it myself and will probably avoid it until I have to.

[quote=324104:@Sam Rowlands]@Jeff Tullin Heres my best bet:
'APFS can’t be used on startup disks or Apple’s Fusion Drives, and filenames are case-sensitive only. ’
This is what I’ve been reading as to why other developers are having trouble with it.[/quote]

With macOS 10.12.4 you can also have case-insensitive APFS. Apple is clearly preparing this for macOS 10.13 coming in autumn - they are already using it for iOS now.

Which is good, but the OP’s customer will certainly have had the case-sensitive version - it’s not certain that was the issue, but it’s a good guess.

I recall having some very fraught conversations with one of my customers a few years back, which were due to them using the existing case-sensitive file system.

And to be honest, that’s not true.
It was due to me not being consistent in the way I referred to folders, files and extensions.
So in one part of my code I might type SupportFiles and in another supportfiles
File extensions … in one part of my code I might look for .thing and in another .Thing
So files would be mysteriously not visible to the app.
On case-insensitive systems (nearly all of them), I got away with it.
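One way to avoid that kind of mismatch is to spell each name exactly once and go through a single accessor everywhere else. A minimal sketch, with hypothetical names (`SupportFilesFolder`, `kSupportFolderName`) that are not from my actual code:

```xojo
// Hypothetical accessor: the only place in the project that spells the folder name.
Function SupportFilesFolder(appFolder As FolderItem) As FolderItem
  Const kSupportFolderName = "SupportFiles"

  If appFolder Is Nil Then Return Nil
  Return appFolder.Child(kSupportFolderName)
End Function
```

Every caller then gets the same spelling, so the lookup behaves the same on case-sensitive and case-insensitive volumes.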

Took a few days to fix MY code - I couldn’t ask the customer to reformat their drive just because I was a lazy typist, even if they were the only customer who ever reported it.
It’s also pretty essential if you are developing for Linux.

Perhaps next time I reformat my drive I should go for case sensitive, just to be on the safe side.

The underlying Unix is case sensitive, is it not ?

OS X is case sensitive since version 1.

And so what ?