Fastest Way To Get FolderItem Names Into Array?

So in my application I generate thumbnails of things to speed up display for later use. These thumbnails are stored in a directory structure off in SpecialFolder.Temporary.child(myAppName), where I create a new subfolder named with the UUID of the item. In my application, there can be many thousands of these items, so I may wind up with thousands of folders created in my temporary directory.

What is the fastest way to get the list of items in that temporary directory into an array of String()?

I’ve tried just looping through f.item(i), which is VERY slow. I’ve also used f.itemsMBS, which is faster, but on an older Windows laptop it still takes 15+ seconds for 2800 items.

Should I be jumping out to a shell, running the dir command and just parsing the results?

Anyone have a faster hack?

Thanks!

What platform are you on? Or rather, which platform is causing the biggest slowdown?

I pulled code from the Windows Functionality Suite (WFS) years ago to speed up going through a directory and getting file names for display.

For Windows, I would use the FindFirstFile/FindNextFile declares.

[code]Function ItemNames(extends f as folderItem, ReturnFileNames as boolean = true, ReturnFolderNames as boolean = true) As String()
// Returns an array containing the names of child items
// within the given folder item. Returns files and/or
// folders, as directed by the input booleans.
//
// Returns an empty array if f doesn’t exist, or if f
// isn’t a directory, or if both input booleans are false.
//
// The iteration is not recursive. On Windows, the special
// directories "." and ".." are ignored.

dim result() as string

if f.Directory then

#if TargetWin32
  
  // On Windows, RB's f.Item(i) is slow for folders with large
  // child item counts, so we use Declares on Windows instead.
  // Adapted from Aaron Ballman - see http://tinyurl.com/5susum
  
  Soft Declare Function FindFirstFileA Lib "Kernel32" (path as CString, data as Ptr) as Integer
  Soft Declare Function FindFirstFileW Lib "Kernel32" (path as WString, data as Ptr) as Integer
  Soft Declare Function FindNextFileA Lib "Kernel32" (handle as Integer, data as Ptr) as Boolean
  Soft Declare Function FindNextFileW Lib "Kernel32" (handle as Integer, data as Ptr) as Boolean
  Declare Sub FindClose Lib "Kernel32" (handle as Integer)
  
  dim UnicodeIsAvailable as boolean = System.IsFunctionAvailable("FindFirstFileW", "Kernel32")
  
  dim ChildData as MemoryBlock // WIN32_FIND_DATA struct
  dim ChildHandle as integer
  
  if UnicodeIsAvailable then
    
    ChildData = new MemoryBlock(592)
    ChildHandle = FindFirstFileW(f.AbsolutePath + "*.*", ChildData)
    
  else
    
    ChildData = new MemoryBlock(320) // sizeof(WIN32_FIND_DATAA): 318 bytes of fields plus 2 bytes of alignment padding
    ChildHandle = FindFirstFileA(f.AbsolutePath + "*.*", ChildData)
    
  end if
  
  if ChildHandle <> -1 then
    
    dim ChildAttrs as UInt32 // first 4 bytes of WIN32_FIND_DATA
    dim ChildName as string
    
    // Loop through remaining items in the folder.
    
    dim FoundNextChild as Boolean
    
    do // loop through remaining children
      
      ChildAttrs = ChildData.UInt32Value(0)
      const NameOffset = 44
      
      if UnicodeIsAvailable then
        
        ChildName = ChildData.WString(NameOffset)
        FoundNextChild = FindNextFileW(ChildHandle, ChildData)
        
      else
        
        ChildName = ChildData.CString(NameOffset)
        FoundNextChild = FindNextFileA(ChildHandle, ChildData)
        
      end if
      
      // Now that we have its name and attributes, we can decide
      // whether this child should be added to our return array.
      
      dim ChildIsFolder as boolean = (ChildAttrs and UInt32(16)) <> 0
      
      if (ReturnFileNames and not ChildIsFolder) or _
        (ReturnFolderNames and ChildIsFolder) then
        
        if childName <> "." and childName <> ".." then
          result.Append(ChildName)
        end if
        
      end if
      
    loop until not FoundNextChild // should really test GetLastError for ERROR_NO_MORE_FILES
    
    FindClose ChildHandle
    
  end if
  
#else
  
  // On non-Windows systems, pure RB code seems pretty fast.
  
  dim child as FolderItem, ub as Integer = F.Count
  
  for i as integer = 1 to ub
    
    child = f.TrueItem(i)
    
    if (ReturnFileNames and not Child.Directory) or _
      (ReturnFolderNames and Child.Directory) then
      result.Append child.Name
    end if
    
  next i
  
#endif

end if

return result

End Function[/code]

On OS X, the FolderItem iteration code is fairly well optimized if you:

  • store the FolderItem’s Count in a local variable
  • iterate linearly from the beginning
  • don’t recurse into any subdirectories while iterating (if you need to do this, build up a work list and recurse later)
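
For what it's worth, here is a rough sketch of those three points in code. The function name AllFileNames is made up for illustration, and it assumes the classic FolderItem API (Count, TrueItem, 1-based indexes), so treat it as a starting point rather than a tested routine:

[code]Function AllFileNames(root As FolderItem) As String()
  // Hypothetical helper: collects the names of all files under root.
  // Subfolders are queued on a work list and visited later, so we never
  // recurse while a folder is still being iterated.

  Dim result() As String
  If root Is Nil Then Return result
  If Not root.Directory Then Return result

  Dim pending() As FolderItem
  pending.Append(root)

  While pending.Ubound >= 0

    Dim folder As FolderItem = pending.Pop

    // Store Count in a local variable instead of re-reading it every pass.
    Dim n As Integer = folder.Count

    // Iterate linearly from the beginning.
    For i As Integer = 1 To n
      Dim child As FolderItem = folder.TrueItem(i)
      If child Is Nil Then Continue

      If child.Directory Then
        pending.Append(child) // visit later
      Else
        result.Append(child.Name)
      End If
    Next

  Wend

  Return result
End Function[/code]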

Using a dictionary is roughly 5 times faster than IndexOf in a simple test.
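
A minimal sketch of what that could look like, assuming the goal is to check whether a given UUID already has a folder. The names thumbFolder and itemUUID are placeholders (a FolderItem and a String you already have), and ItemNames is the extension method posted above:

[code]// Build the lookup once from the folder names...
Dim folderNames() As String = thumbFolder.ItemNames(False, True)
Dim known As New Dictionary
For Each folderName As String In folderNames
  known.Value(folderName) = True
Next

// ...then each membership test is a keyed lookup instead of
// folderNames.IndexOf(itemUUID), which scans the array every time.
If Not known.HasKey(itemUUID) Then
  // no thumbnail folder for this item yet
End If[/code]

The gain only shows up when you do many lookups against the same list. For a single check, building the Dictionary costs more than one IndexOf call.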

@Karen Atkocius - a HUGE thank you for sharing that method. On Windows, this is several orders of magnitude faster than iterating over FolderItem.item() - in fact, raw iterating took 46920364.7555404 microseconds (about 47 seconds), while your method takes only 11604.9673061371 microseconds (about 1/100th of a second) over the same dataset.

That’s a whopping speedup.

Thanks!

Very interesting, I’ll have to pull this into my code and see what happens. Thanks Karen.

You are welcome…

But I did not write that. I just pulled it out of the WFS!

Over the years I’ve run into several cases where I needed to iterate through folders with a lot of files in Windows so I was very motivated to find a solution long ago.

- Karen

PS: Because of the poor performance on Windows in iterating through a folder, I think it might be a good idea for such a method to be part of the framework.

I use the above method when I iterate through large directories on the Mac as well.

Any time there is such an obvious platform-specific bottleneck, it would be nice for the framework to offer X-Platform methods to solve the problem where possible, so that we can just use the same code regardless of platform.

Karen, I have posted this to Xippets for other members to use.

I once wrote the FileListMBS class to list files quickly.
The best speed actually can be achieved if you avoid folderitems.

[quote=111513:@Christian Schmitz]I once wrote FileListMBS class to list quickly.
The best speed actually can be achieved if you avoid folderitems.[/quote]

FileListMBS works very well. I started using it recently and have been very happy with it.

I concur.

I currently use FileListMBS on projects where scanning a few million files can be a small job. The last scan I ran was for 827,014 files, which took 70394 ticks (about 1,173 seconds at 60 ticks per second).

I just pasted Karen’s routine into a test app of mine and the debugger is telling me “that the extends modifier cannot be used on a class method.” What am I missing here?

I’m running 2014r2 on 64-bit Windows 7.

- Dale

Put the method into a module instead of adding it to a window.

FileListMBS++

Same here.

Nope. Tried that and its existence isn’t recognized. Whether I qualify the name or not, I get “this item does not exist.” I made sure it is global in scope. If I create a second, non-extends type method next to it, the non-extends one gets globally recognized.

Add a new Module named “Globals” to your project (not to a window).
In that module, add the ItemNames function.
Works like a charm.

That’s all it takes, really.
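
Once it lives in a module, calling it looks something like this. This is just a sketch: "MyAppName" stands in for whatever subfolder name your app actually uses, and the MsgBox is only there to show the result.

[code]Dim cache As FolderItem = SpecialFolder.Temporary.Child("MyAppName")

// An Extends method is still called on an instance, so guard against Nil
// before using it.
If cache <> Nil Then
  If cache.Exists Then
    // Folders only: ReturnFileNames = False, ReturnFolderNames = True
    Dim uuids() As String = cache.ItemNames(False, True)
    MsgBox("Found " + Str(uuids.Ubound + 1) + " thumbnail folders")
  End If
End If[/code]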