Performance Curiosity

@Wayne Golding -
Yup, I noted the flaw when I employed the MBS class – much better to evaluate the property to an integer once and then use that integer as the loop’s upper bound, instead of evaluating ParentDir.Count with each iteration. Still, it doesn’t explain the huge disparity between my old MacBook and newer MacBook Pro (since both used the same code). Good call though.
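For readers who want to see the loop-bound point in isolation, here is a minimal sketch in Python rather than Xojo. The `Folder` class and its `count_reads` counter are hypothetical stand-ins for a directory object whose count is expensive to read; they are not the MBS or Xojo API.

```python
class Folder:
    """Hypothetical stand-in for a directory object with a costly count."""

    def __init__(self, names):
        self._names = list(names)
        self.count_reads = 0  # how many times the count property was evaluated

    @property
    def count(self):
        self.count_reads += 1
        return len(self._names)

    def item(self, index):
        return self._names[index]


def visit_rereading(folder):
    """Re-evaluates folder.count on every pass through the loop."""
    visited = []
    i = 0
    while i < folder.count:
        visited.append(folder.item(i))
        i += 1
    return visited


def visit_hoisted(folder):
    """Evaluates folder.count once and reuses the integer as the upper bound."""
    last = folder.count
    return [folder.item(i) for i in range(last)]
```

Both functions visit the same items; only the number of times the count is evaluated differs, which matters when each evaluation hits the file system.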

@Mark Strickland – Well, for the most part, that’s kind of what I’m doing. I have millions of small XML files that hold various key/value pairs. I’m building a searchable database from the contents of each file. First I have to “find” the files, then I build a searchable database from them. I don’t need the entire file stored as a blob though, just some of the searchable elements and a path to where the full file is. This is a one-time operation where I index all existing files and build a database for read-only use by another tool.
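A rough sketch of that find-then-index idea, in Python with SQLite. Everything specific here is an assumption made for illustration: the thread does not describe the XML layout, so the sketch assumes each key is a direct child element of the root, and the table name `entries`, the function `build_index`, and the `wanted_keys` parameter are all hypothetical.

```python
import os
import sqlite3
import xml.etree.ElementTree as ET


def build_index(base_dir, db_path, wanted_keys):
    """Walk base_dir, parse each .xml file, and store selected key/value
    pairs together with the file path. Returns the number of files indexed."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries (path TEXT, key TEXT, value TEXT)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_kv ON entries (key, value)")
    indexed = 0
    for root, _dirs, files in os.walk(base_dir):
        for name in files:
            if not name.lower().endswith(".xml"):
                continue
            path = os.path.join(root, name)
            try:
                tree = ET.parse(path)
            except (ET.ParseError, OSError):
                continue  # skip unreadable or malformed files
            # Assumed layout: keys are direct children of the root element.
            rows = [
                (path, child.tag, (child.text or "").strip())
                for child in tree.getroot()
                if child.tag in wanted_keys
            ]
            conn.executemany("INSERT INTO entries VALUES (?, ?, ?)", rows)
            indexed += 1
    conn.commit()
    conn.close()
    return indexed
```

Only the searchable elements and the path are stored, so the other tool can query the database and open the full file from its path when needed.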

Oh, and by a “one-time” operation… I mean I will use it one time on a specific cluster of files. I will use the solution repeatedly on numerous groups. Thus I have my test case, average case, and worst case expectations. I may use this tool on hundreds of unique file clusters with the same structure as my test case. So… this isn’t a “use it once” type project. It’s one-and-done per file cluster, and there are probably more than 1,000 such file clusters.

Maybe it’s something about the way I have my newer systems configured. If you would like to try this little recursive test app, it’s available here --> https://conversionftp.cdk.com/?u=s1f0&p=Eikn&path=/SlowFind.xojo_binary_project

“Basedir” is set to your /Users folder – it should just give you a count of file/directory paths. I’ve seen performance vary between 100 and 6,000 nodes visited per second. How does your system do?

You will need to change the value of BasePath (remove the quotes) and/or provide your own base path.
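For anyone without Xojo at hand, a comparable measurement can be sketched in Python. This is not the SlowFind project; it is an assumed equivalent that counts every file and directory under a base path and reports nodes visited per second. It uses an explicit stack instead of recursion, and the function names are made up for this example.

```python
import os
import time


def count_nodes(base_path):
    """Count every file and directory below base_path (explicit stack, no recursion)."""
    count = 0
    stack = [base_path]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    count += 1
                    # Do not follow symlinks, to avoid cycles.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            continue  # unreadable directory: skip it and keep going
    return count


def nodes_per_second(base_path):
    """Return (total nodes, nodes visited per second) for one crawl."""
    start = time.perf_counter()
    total = count_nodes(base_path)
    elapsed = time.perf_counter() - start
    rate = total / elapsed if elapsed > 0 else float("inf")
    return total, rate
```

Calling `nodes_per_second("/Users")` gives a figure that can be set against the 100 to 6,000 nodes per second mentioned above, with the caveat that a different runtime and API are being measured.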

Testing is so odd. The documents folder of my MacBook Air on APFS took 160 ticks with MBS vs. 1180 ticks for Chilkat. 3000 files or so. Then I tried with the complete hard disk (100 GB). This didn’t finish in about half an hour with MBS.

@Beatrix Willius - yes, if I had 3,000 files or so, it would probably be a moot point, as any method would yield sufficient results. But once you start pushing 100 GB and beyond… performance seems to go downhill rapidly… except on my old MacBook – that one keeps humming along nicely. I expected the time to increase fairly linearly with the number of nodes visited.

I had tested up to 100k files on HFS. I’m going to test again later today. The 100 GB on the Air don’t have too many files, so the performance is really odd. Could the files on iCloud be a factor?

Oops. The Chilkat version simply dies on the documents folder on my main computer.

MBS is better: 2467 ticks for 351 GB (HFS/High Sierra, latest MBS).

How many files on that 351 GB?

434071 files.

OK… and just so we are clear… what is your “tick” unit of measurement? Would you care to share the main directory crawling code?

I uploaded my test project to http://www.mothsoftware.com/downloads/test.zip . Comment out the Chilkat stuff. The MBS stuff is the same as yours.

Ticks aren’t the lice variety but these here: http://documentation.xojo.com/api/deprecated/ticks.html .

60 ticks is a second.
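To put the tick figures in this thread into more familiar units, the conversion can be sketched as below. The only fact used is the 60-ticks-per-second ratio stated above; the function names are made up for this example.

```python
TICKS_PER_SECOND = 60  # ratio stated in the thread


def ticks_to_seconds(ticks):
    """Convert a tick count to seconds."""
    return ticks / TICKS_PER_SECOND


def files_per_second(files, ticks):
    """Throughput for a crawl that visited `files` files in `ticks` ticks."""
    return files / ticks_to_seconds(ticks)
```

With the numbers quoted earlier, `ticks_to_seconds(2467)` is about 41.1 seconds, and `files_per_second(434071, 2467)` works out to roughly 10,557 files per second.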

@Christian Schmitz: why does my test project run so slow on the MacBook Air with APFS? It’s got an SSD and not too many files.

19.0 plugin or newer?
Your test project misses files.

And I just updated the “FileList Recursive” project here to not use folderitem.
For my documents folder here, it is 17105 files in 326 ticks in Xojo, 40 ticks with MBS Plugin. Same code in Real Studio is 484 ticks, so Xojo 2018 is already faster than the old Real Studio.

Latest and greatest of MBS. Comment out the Chilkat stuff. For a lower number of files the code runs fast.

The Air has done 700k files after about one hour. There aren’t that many files on the laptop. I need to check if the networked drive from the main computer is also done.

The app DaisyDisk analyses my 150 GB on the Air in less than a minute. I unloaded the networked drives from my main computer. The MBS code has now been running for 5 minutes. Can you upload your updated test project?

I’ll include updated projects with the next MBS Plugin prerelease.
I can also email you copies.

So it’s currently 7 seconds with plugin and 195 seconds with Xojo for 237000 files in my system folder.

10k ticks (< 3 min) for 150 GB and >700k files on the MacBook Air and the new example.

@Christian Schmitz: you rock!

Sorry, I’m just so accustomed to measuring things in milliseconds – must be my Linux roots. Sounds like Christian has THE high-performance solution. I’m still a bit miffed that Xojo R2018.4 folder-items work so much faster on… the older HFS+ file system? Performance diminishes the longer it runs, and I’ve watched my app’s memory usage climb from an initial ~25 MB to over 100 MB, while on my old MacBook it remains at 15.5 MB for the duration of execution. This leads me to think there is a memory leak somewhere, or the garbage collection is just “different” and less performant on APFS or something. That’s curious, isn’t it? 100 MB for this simple little app?
Looks like I’ll be employing the MBS class in my solution. I’d like to personally thank all the folks that participated in this discussion. I would like my solution to work cross-platform – so the MBS class wins over shelling out to an OS-specific system tool.

@Beatrix Willius - I did want to try out your example - alas it looks like your GetFilesThread is an external resource.