Reading a Folder's contents under m1

Emile_Schwarz · September 9, 2021, 9:09am

This is a follow-up from another discussion: new case discovered.

We all know (now) that the order in which items are read from mass storage is no longer alphabetized.

I discovered yesterday, and confirmed today, that this is not true for previous hard disks.

I was working on a folder that resides on the internal (boot) m1 disk. Then I had to work with that folder on an i5 laptop, so I moved the folder to an external HD (formatted on that i5 / El Capitan).

When I tested the project back on the m1 laptop, the project worked fine. I was suspicious, so I made a copy of the folder on the m1 boot disk, ran the project and… the loaded items were not alphabetized…

Conclusion:
The reading order depends on how the mass storage (hard disk) was formatted (file system: APFS or older). On APFS, you have to alphabetize the items yourself if you need them in that order.
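To make the conclusion concrete, here is a minimal sketch of imposing the order in code rather than trusting the file system. It is written in Python purely for illustration (it is not Xojo code); the function names `sorted_names` and `list_folder_sorted` are made up for this example.

```python
import os


def sorted_names(names):
    """Sort entry names alphabetically, ignoring letter case."""
    return sorted(names, key=str.casefold)


def list_folder_sorted(folder):
    """List a folder and impose the order ourselves.

    os.listdir() is documented to return entries in arbitrary order,
    so the sort is what guarantees the result, whatever the file system.
    """
    return sorted_names(os.listdir(folder))
```

With this, the result is the same whether the folder lives on an APFS or an HFS+ volume, because nothing depends on the order the volume hands back.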

Arnaud_N · September 9, 2021, 9:30am

Yes, this is already fairly well known. Most filesystems return sorted files; APFS is the only one I know (used these days) which breaks this. And I find it nonsense.

Emile_Schwarz · September 9, 2021, 11:57am

But the unsorted files always come in the same order.

I tried to save files in a newly created folder one at a time, and in my order (alphabetical), and I get them in its own order.

The surprise comes when I loaded files from an HFS+ formatted hard disk: they were alphabetically sorted !

It will stay as nonsense until we get an explanation (eventually). Until then, this is quite stupid.

The work-around is easy (even if I had to write it twice, as my project reads a lot of folders, then the folders inside each of them). I forgot to apply the work-around to one of them.

Arnaud_N · September 9, 2021, 12:35pm

In the same order as what? Other accesses? Write time?

“Own order” as in “they were shuffled”? Weird…

Didn’t you say you saved them in “your order” (alphabetically) in the first place? I’d expect them to stay sorted alphabetically later, especially on HFS+.

I tend to agree. But life teaches us things can appear nonsense or stupid only until we know a good reason why it was done that way. On the other hand, sometimes there’s just no reason but just laziness or lack of interest…
Hard to know when things are just stupid or we’re just unaware, in this world…

There are several of them, like sorting with an array of name (AllItemsName.SortWith AllItems).
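As a rough illustration of the "sort one array and carry a second one along" idea behind SortWith, here is a Python sketch (not Xojo). The helper name `sort_with` is hypothetical, and unlike an in-place sort it returns two new lists.

```python
def sort_with(names, items):
    """Sort `names` alphabetically (case-insensitive) and reorder `items` to match.

    `names` and `items` are parallel lists: items[i] belongs to names[i].
    """
    # Pair each name with its item, sort the pairs by name only, then unzip.
    pairs = sorted(zip(names, items), key=lambda pair: pair[0].casefold())
    return [name for name, _ in pairs], [item for _, item in pairs]
```

For example, `sort_with(["b", "a"], [2, 1])` gives back the names as `["a", "b"]` and the items as `[1, 2]`, so each item stays attached to its name.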

Granted. Sometimes we just think “Well, I’ll use this only once in my project, so no need to make a global module or a whole method just for that”. Later, we eventually need to make a similar method, only to realise our first method is mixed with other functions and we’d have to rewrite it separately.
Long-term vision may decrease this if you know all you want to do since the beginning; that’s not always (and not frequent for some)…

Emile_Schwarz · September 10, 2021, 7:07am

The unsorted order:
a. the order it comes originally when the master folder was in the M1 Boot disk.
b. That folder was moved to an external HD… formatted years ago with an i5 laptop on El Capitan
c. Then I moved back in the M1 Boot disk.

a and c return the files in the same order (or disorder, if you please): the last 3 magazine issues were 5, 17, 18.

b: the files are returned alphabetically (1 thru 18 in order).

Yes, 18 issues.

In short, the unordered file listing is built into APFS.
YOU CAN GET FILES RETURNED ALPHABETICALLY ON M1 if the hard disk was NOT formatted on an M1 computer (using APFS; I have not tested other file systems).

Jeff_Tullin · September 10, 2021, 7:15am

We all know (now) that the order in which items are read from mass storage is no longer alphabetized.

We have not been able to rely on the order of files in a folder on a Mac for several years.
It is no more predictable than running Select * from table in a database.

If there were a select condition we could apply to the FolderItem in Xojo, Xojo could ‘protect’ you from that, and even provide more functionality, such as delivering files in an order such as
Alphabetical
Reverse alpha
By date
By time

But creating a method of your own to do the same is possible, albeit at an additional cost in performance.

Arnaud_N · September 10, 2021, 7:24am

It would be nice if there was some kind of selector to list items in a folder the way one wants, but I can see why file systems are not designed that way.

Emile_Schwarz · September 10, 2021, 9:40am

Can you explain why APFS is designed that way ?

Emile_Schwarz · September 10, 2021, 9:51am

Me excluded. Before APFS, I always got the files in alphabetical order (since sometime in the MacOS 6 days).

In the MacOS 6 days, it was possible to set the order of the files (alphabetized; I was able to do that, and never thought to try a time sort, for example):
display the folder by name (from a to z),
select all files,
move the files with the Option key down (if my memory is correct) onto the owning folder's icon… et voilà.

a. At worst, I think the “natural” order is the time order of “copied in the folder”. [I tried that]
b. Another one can be the items creation date.
Both of the above would make sense (the file system has nothing to do in a; it has to do a sort in b, but if so, why not sort alphabetically?)

Usually, I would say that the real problem (trouble) is the lack of documentation. But we do not have documentation since… ages.

I remember the “User’s Guide” and “MacOS User’s Guide”; even the “Technical” books (and I have many of them, from 1985 through 1994, then…)
(I skipped the MacOS Release Notes)

Jeff_Tullin · September 10, 2021, 10:28am

Me excluded. Before APFS, I always got the files in alphabetical order

Maybe on your machines.
As I say, it is unpredictable. You cannot rely on your app, used on another machine, returning files in alphabetical order.

ChristopheDV · September 10, 2021, 10:35am

As Jeff already said, on macOS you cannot rely on the order. It doesn’t matter which file system.

You will need to load all files in a list and order it yourself. MBS has a class that does this very fast (milliseconds - even for very large amounts of files).

Emile_Schwarz · September 10, 2021, 11:50am

On all the machines I ran until 2014/High Sierra, I got the files alphabetically. Even using AppleScript…

I do not know why.

Christophe:
Since I noticed that (in the previous months, 2 or 3, can’t recall), I have been filling an array and, after excluding the items I do not want (sometimes files, sometimes some folders by name, always invisible files), sorting the array and using .TrueItem(Items_Arr(Idx)) to process the chosen items.
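The filter-then-sort routine described above might look roughly like this. It is a Python sketch for illustration only (not the project's Xojo code): `choose_items` is a hypothetical name, and treating a leading dot as "invisible" is a simplifying assumption that ignores the macOS hidden flag.

```python
def choose_items(names, excluded=()):
    """Drop unwanted entries, then return the rest in alphabetical order.

    Assumption: an entry is "invisible" when its name starts with a dot.
    `excluded` holds names to skip, compared without regard to letter case.
    """
    skip = {name.casefold() for name in excluded}
    kept = [
        name
        for name in names
        if not name.startswith(".") and name.casefold() not in skip
    ]
    return sorted(kept, key=str.casefold)
```

Filtering before sorting keeps the sort small, and doing both in one helper avoids having to remember the work-around at every place a folder is read.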

And about the process time… it is not really a problem for me: small amount of data to check. I may add a wait window when the project is stable enough, if the process time goes over one second (with hundreds of files to deal with).

ChristopheDV · September 10, 2021, 12:29pm

Pure luck.

Just iterating through a folder of files with FolderItem is very slow (read: ridiculously slow). Depending on the number of files, it can take several seconds (or minutes). With the MBS class (I don’t remember which) it takes <1sec for thousands of files.

ChristopheDV · September 10, 2021, 12:30pm

You should avoid this - because you can with the MBS class.

Edit: FileListMBS is what you need.

Arnaud_N · September 10, 2021, 2:53pm

Explain why APFS doesn’t return alphabetically-sorted items, I presume?
No, I can’t. I’d like to know.

I bet it’s either a limitation of the whole design or they just didn’t care for that, the former sounding more logical.

Emile_Schwarz · September 13, 2021, 9:52am

This is the result of the project using a comic book edited in France.

The data archive is (data generated with the project):

I placed three screenshots at the end for better understanding.

The project takes around 3 seconds to process the data and display the resulting html in TextArea. This is the maximum size for a folder to be processed (to date).

Conditions of the run:
i5 laptop from 2014
around 36Mbps external hard disk *
macOS High Sierra
EyeTV software running (DVB TV)

Of course, connecting an SSD (without a USB 2 hub that slows down the data access time) will speed up the process.
Using an Apple m1 laptop with the master data stored in its fast internal SSD will also speed-up the process.

In the future, I will make an intensive search to accelerate the code process time.

I just ran the project, the part that extracts data from a master folder, from the IDE.

The folder holds 26.8GB of data stored in 452 folders (for a total of 18,808 items [files and folders])

To help you understand, here’s a simple explanation of what the project does:

From a dropped folder (master), the project…

Open each first child of that master folder
Search for an archive file and read its size
Search for a folder whose name ends with a ‘-’ (no space before or after)
Read each file name in that folder
and process it to make a book summary (check the provided archive contents)
Create two tables:
a. a table of the missing issues (with magenta background)
b. a table of the scanned issues (with alternating background colors: blue and yellow)
The method stores the data as html in a TextArea.
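The "missing issues" versus "scanned issues" split in the steps above could be computed along these lines. This is a Python sketch, not the project's code; `split_issues` is a made-up name, and the assumption that the expected range runs from the lowest to the highest issue number found is mine.

```python
def split_issues(found):
    """Return (missing, scanned) issue numbers, both in ascending order.

    Assumption: the expected range runs from the lowest to the highest
    issue number actually found in the master folder.
    """
    scanned = sorted(set(found))
    if not scanned:
        return [], []
    have = set(scanned)
    missing = [n for n in range(scanned[0], scanned[-1] + 1) if n not in have]
    return missing, scanned
```

Because the numbers are sorted here, the two tables come out in issue order no matter which order the folders were read in.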

The user has two choices (besides clearing the result):

a. Display the html data in another window that have an HTMLViewer

b. Save the html data (TextArea contents) into a… html file

I use two loops to alphabetize the items to be read from the hard disk;
I use two loops to scan the data folder: one to process each folder in the master folder, the second to process the scanned file names to build the table of contents.
Each magazine’s scanned page files use a file name that holds the page number and a simple description of what’s in the scanned page, or nothing; look at the resulting html to better understand. You know what a Table of Contents is.

Nota: the project is able to resize and save the magazines covers from a folder into an “images” folder used by the generated html file.

Also:
If the project finds a “Notes.txt” file, it loads its contents (as html) and places them below the scan image (look at the archive).

The hub I used (I got it with the Apple m1 laptop) seems to be an old gadget that slows everything down to below 40 MBps (checked with ad hoc software). Some vendor “good” work (!).

Data about each scan page resides in the file name. Below is page 1 for issue 451:
Mandrake 451 - 01 - Couverture par Mario Caria (Mandrake).jpg
The software will report that as:

Couverture par Mario Caria (Mandrake)
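Pulling the description out of such a file name could be sketched as below. This is Python for illustration, not the project's Xojo code. `parse_scan_name` is a hypothetical name, and the pattern assumes every scan is named "title issue - page - description", with the description part optional.

```python
import os
import re

# Assumed layout: "<title> <issue> - <page>[ - <description>]" plus an extension.
SCAN_NAME = re.compile(
    r"(?P<title>.+?) (?P<issue>\d+) - (?P<page>\d+)(?: - (?P<desc>.*))?"
)


def parse_scan_name(filename):
    """Return (title, issue, page, description), or None if the name does not fit."""
    stem, _extension = os.path.splitext(filename)
    match = SCAN_NAME.fullmatch(stem)
    if match is None:
        return None
    return (
        match.group("title"),
        int(match.group("issue")),
        int(match.group("page")),
        match.group("desc") or "",  # a page may carry no description
    )
```

Converting the page to an integer also gives a numeric sort key, so pages can be ordered by number instead of by the order the file system returns them.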

This is a part of the master folder. Each folder holds data about one issue (first 21 issues displayed).

Contents of issue 451; notice the archive and the default MacOS icon (where the scans of the issue reside).

Isn’t my icon nice ?

Emile_Schwarz · September 14, 2021, 10:31am

5 downloads…

I moved a copy of the project and of the base data (data to process) onto the m1 internal SSD (fast SSD) and ran the project.

The first run displayed each issue twice (!) and took more than 30 seconds (instead of 3 seconds in a 2014 i5 laptop).
I quit Xojo, renamed the data base folder, shut down the laptop.
Rebooted it without the external HD, fired up Xojo 2021r2.1, ran the project, and it took a while too (more than 20 seconds).
The generated html file was correct (and half the size of the previously saved file).

Oh, I changed nothing in the project, which I created and wrote in Xojo 2015r1 (no API2 involved here, IMHO).