Fastest File Finding...

Matthew_Combatti · December 18, 2015, 6:16pm

The fastest method on Windows that I’ve found to count all files in a directory and its sub-directories is to use the command line. Is there a fast method using native code to do the same thing (looping through FolderItem.Item() is slow)? Invoking the command-line commands to run the search yields “Access is denied.” I have even tried writing the commands to a batch file, running it from a shell, and parsing the ECHO output. But echo does not work in a Xojo Shell as it does in the Windows command line.

The Command to get all files and sub-files:

dir /b /a-D *.* /s 2> nul | find "" /v /c > tmp && set /p count=<tmp && del tmp && echo %count%

It can process 250,000 files in less than 1 second. So far, every native Xojo method I’ve tried takes 45+ seconds. In the application which is being developed, that is unacceptable.

Any suggestions/methods on how to get the command line command to work with a Xojo shell?

Matthew_Combatti · December 18, 2015, 6:17pm

I’ve tried using API declares, and those are faster than Xojo, but still slow.

Travis_Hill · December 18, 2015, 6:23pm

Perhaps echo/output the count directly to a temp file instead, and read it from there in your app.

Matthew_Combatti · December 18, 2015, 6:27pm

AHA:

dir /b /a-D . /s 2> nul | find "" /v /c > tmp && set /p count=<tmp && del tmp && echo %count%>count.txt

and reading from the file did the trick. Knew it had to be something simple. I’m pressed for time today with the holiday approaching, and brain-farts are surely imminent. Thanks Travis! Happy holidays to you!
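For anyone who wants to try the same "let the shell count, then read the number back from a file" pattern outside Xojo, here is a minimal sketch in Python (not Xojo). The function name `count_files_via_shell` is made up for illustration. The Windows branch is a shortened form of the `dir | find` pipeline above, and the `find | wc -l` branch for Mac/Linux is an assumption that does not come from this thread.

```python
import os
import shlex
import subprocess
import tempfile


def count_files_via_shell(root):
    """Count files under root by letting the OS shell do the work.

    The shell writes the count to a temporary file, which is read back here.
    """
    fd, out_path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        if os.name == "nt":
            # Same idea as the dir | find pipeline, without the set /p step.
            cmd = 'dir /b /s /a-d "{}" 2>nul | find /c /v "" > "{}"'.format(
                root, out_path)
        else:
            # Assumed POSIX equivalent (hypothetical, not from the thread).
            cmd = "find {} -type f | wc -l > {}".format(
                shlex.quote(root), shlex.quote(out_path))
        subprocess.run(cmd, shell=True, check=False)
        with open(out_path) as fh:
            text = fh.read().strip()
        # An empty file means the shell produced no count; report 0.
        return int(text) if text else 0
    finally:
        os.remove(out_path)
```

Usage would be something like `count_files_via_shell(r"C:\Data")`. Note that the temporary file lives in the system temp folder, so it could be counted itself if that folder is the one being scanned.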

Andrew_Lambert · December 18, 2015, 7:57pm

This example from the old forum is probably about as fast as you can possibly get.

Marco_Puppo · December 19, 2015, 11:54am

The example from the old forum works well only with 32-bit versions of Windows.

With 64-bit Windows it works only “sometimes”: the first call always works, but subsequent calls do not always.

With 64-bit versions of Windows you need to add other dedicated declares.

FindFirstFileEx function

It’s better and faster to use a shell redirecting the output to a text file.

Emile_Schwarz · December 19, 2015, 1:02pm

I have that in a project: the first folder drop always works, the second quits the target application (OS X, 2015r1).

Marco_Puppo · December 19, 2015, 1:26pm

The problem is with Microsoft Windows.

With Mac OS X you have to try the code samples that you can find in the help system for FolderItem.

Can you post the relevant part of your code? We can try to find and fix the problem.

Matthew_Combatti · December 19, 2015, 2:04pm

Thanks, but I’ve resorted to using the command line only. I had used the Find(*) APIs, and they were slower than the command line once recursion was implemented. Once you have the list structure, you still have to loop through each item and invoke the API to determine whether it’s a file or a directory, then browse into it, list all items, and repeat for each item… basically rewriting Xojo’s very own FolderItem class. Even when invoking the API directly, the calls are still made at the speed of the Xojo framework. That is faster than the built-in class, but extremely slow (unacceptable for the number of files being scanned) compared with the command-line commands, which invoke applications written entirely in C by the manufacturer of the operating system itself to do the count.

At this point, the Windows version is still faster than the Mac and Linux versions: using the command line, Windows takes 0.02 seconds for the same directory that the Mac takes 0.93 seconds to scan. These numbers may seem small, but the actual directory it will be scanning has nearly 10,000 times more files and directories than the test one. I’ll be investigating a faster methodology for Mac and Linux as well.

But thank you for your help, guys. I’ll post my cross-platform solution once everything has been investigated.
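For reference, the list / test / descend loop described in this post looks roughly like the sketch below. It is written in Python rather than Xojo, purely to illustrate the algorithm, and the name `count_files` is hypothetical. It uses an explicit stack instead of recursion, and skips directories it cannot read (the "Access is denied" case) instead of failing.

```python
import os


def count_files(root):
    """Count regular files under root, descending into every sub-directory."""
    total = 0
    pending = [root]  # directories that still have to be listed
    while pending:
        current = pending.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        # Queue the sub-directory for a later listing.
                        pending.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        total += 1
        except OSError:
            # Unreadable directory (e.g. access denied): skip it.
            continue
    return total
```

In most cases `os.scandir` gets the file-or-directory answer from the directory listing itself, so the separate per-item check mentioned above is usually not an extra system call there. Whether that beats the shell on a given machine would have to be measured.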

Marco_Puppo · December 19, 2015, 2:14pm

Thank you.

Garth_Hjelte · December 19, 2015, 6:18pm

This is pretty interesting. We are talking simply about getting a single number? The number of files - not including folders, so we aren’t talking about “objects” - in a specified folder, including all sub-folders? Which means FolderItem.Count isn’t what we want? And we aren’t talking about any properties of the file, just that it exists?

I find this interesting, but of what use is it? I know we as programmers want to know all sorts of things for different reasons, but why is this useful in the practical sense, and what IF you are off by a couple objects?

For me, I can see this being practical if I want to know in advance how serious a subsequent call to enumerate those objects would be. Like if it returned something huge like 250,000, maybe I’d want to alert the user that “it’s gonna be a while!”

Michel_Bujardet · December 19, 2015, 6:54pm

What I find useful here is how much faster a simple shell can be as compared to even the system API. Sometimes we tend to go for complex, when the solution has been there all along.

Alexander_van_der_Linden · December 20, 2015, 7:56pm

In my opinion, ‘dir’ with any of its parameters does NOT count/tally/index files and folders; instead it makes a call to the File Allocation Table and obtains its info there.