Speed it up?

Does anyone have a suggestion for speeding this up a ton?

It takes a given folder and adds to an array the path, size (in bytes, KB, MB, GB), subfolder count, and file count. It recurses into subfolders. The array gets pushed into a SQL database.

It works decently until I hit a subfolder with 10,000 files; then it's 5 to 10 minutes on that folder.

I'm guessing the bottleneck is creating a FolderItem 10,000 times? I'm betting an MBS plugin might help, but I can't figure out which one.

Thanks, Bill



//ReadFolderItems(theFolder As FolderItem)


//Scans the selected folder and sub folders
//wrap all data in single quotes
//Adds folder stats data to sql

FolderCurrent = theFolder

//Exit if the folder is Nil or missing
If theFolder = Nil Or Not theFolder.Exists() Then
  Return 
End If

//Get the item count for this folder
var itemCount as Integer
itemCount = theFolder.Count

//Loop through the items in this folder
For i As Integer = 1 To itemCount
  
  //Get the current item i as f 
  var f As FolderItem
  f = theFolder.TrueItem(i)
  
  //See if f is a folder or file
  If f <> Nil Then
    If f.Directory Then
      //Add this folder stats to the array for the database row
      Var MyData() As String
      
      //Add this folder path while escaping single quotes
      MyData.Add("'" + f.NativePath.ReplaceAll("'","''") + "'")
      
      //Use plugin to get folder stats
      Var d as DirectorySizeMBS = f.CalculateDirectorySizeMBS(True,0)
      
      //Check that stats are available. Some folders are symlinks and will have nil details
      if d <> nil Then
        //Not a virtual dir, so we have stats
        MyData.Add("'" + d.PhysicalTotalSize.ToString + "'")
        MyData.Add("'" + Format(d.VisiblePhysicalTotalSize/1024,"0") + "'")
        MyData.Add("'" + Format(d.VisiblePhysicalTotalSize/1048576,"0") + "'")
        MyData.Add("'" + Format(d.VisiblePhysicalTotalSize/1073741824,"0") + "'")
        MyData.Add("'" + d.FolderCount.ToString + "'")
        MyData.Add("'" + d.FilesCount.ToString + "'")
        
      else
        //d was nil, so probably a symlink dir.  Supply nils
        MyData.Add("'" + "nil" + "'")
        MyData.Add("'" + "nil" + "'")
        MyData.Add("'" + "nil" + "'")
        MyData.Add("'" + "nil" + "'")
        MyData.Add("'" + "nil" + "'")
        MyData.Add("'" + "nil" + "'")
        
      end if
      
      //add to sql database
      InsertToDb(MyData)
      
      //Recursion into folder for children folder
      ReadFolderItems(f)
      
    Else
      //Must be a file. Ignore for this purpose
      
    End If
  End If
Next






You want FileListMBS. Don’t forget to set the yield parameter or you will get the beachball.


Have you read the FolderItem documentation? I remember some advice there (and some "don't do"s)…

Read the "Performance Considerations" section of the FolderItem documentation (near the end of the page, before the code examples).

For the outer loop, use For Each instead. It's much faster:

For Each f As FolderItem In theFolder.Children

Another thing you could do is recurse first and use the Else branch to accumulate the sizes and counts as you go; when the tree is several levels deep, you are otherwise counting the same items over and over.
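
Combining the two ideas, here's a minimal sketch of the bottom-up version. ScanFolder and the ByRef totals are made-up names, and Nil checks and the database insert are omitted; the point is that each parent sums its children's totals instead of rescanning the whole subtree:

Sub ScanFolder(theFolder As FolderItem, ByRef totalBytes As Int64, ByRef fileCount As Integer, ByRef folderCount As Integer)
  For Each f As FolderItem In theFolder.Children
    If f.IsFolder Then
      //Recurse first so the child reports its own totals
      Var childBytes As Int64
      Var childFiles, childFolders As Integer
      ScanFolder(f, childBytes, childFiles, childFolders)
      
      //Insert the child's row here using childBytes/childFiles/childFolders,
      //then roll its totals into this folder's
      totalBytes = totalBytes + childBytes
      fileCount = fileCount + childFiles
      folderCount = folderCount + childFolders + 1
    Else
      //Count each file exactly once
      totalBytes = totalBytes + f.Length
      fileCount = fileCount + 1
    End If
  Next
End Sub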


Also, what is this function doing? If you are doing a whole series of inserts into a database, you are far better off creating a prepared statement and then using it repeatedly with your different datasets. For 10,000 ordinary SQL statements, converting to a single prepared statement and using it 10,000 times gives a very significant speed boost.

Each time you run SQL like:

Var cSQL as String = "Insert into tablename (a, b, c, d, e, f) values (?, ?, ?, ?, ?, ?)"
For nFile as Integer = 1 to 10000
    // This line is working out how to perform this insert in 
    // the best possible way every single time it is used
    oDB.SQLExecute( cSQL, parameters )
Next

Do the work once:

Var cSQL as String = "Insert into tablename (a, b, c, d, e, f) values (?, ?, ?, ?, ?, ?)"
Var ps As PostgreSQLPreparedStatement // or some other DB type

// This line works out how to perform this insert 
// in the best possible way, once
ps = db.Prepare( cSQL )

For nFile as Integer = 1 to 10000
    // This line follows the prepared pattern each time
    ps.SQLExecute( parameters )
Next

I'm away from my computer so I've not been able to check the code, but the principle is sound.


You could use a class/object with properties:

Var MyData() As MyDataClass

Var data As New MyDataClass
data.PhysicalTotalSize = d.PhysicalTotalSize
data.FolderCount = d.FolderCount
data.FilesCount = d.FilesCount
MyData.Add(data)

If you use a remote SQL database, use a transaction around your inserts.
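
A minimal sketch of that, assuming an API 2.0 Database subclass in db and reusing the InsertToDb routine from the original code:

db.BeginTransaction

Try
  //... run all the InsertToDb calls here ...
  db.CommitTransaction
Catch e As DatabaseException
  //Undo the partial work if any insert fails
  db.RollbackTransaction
End Try

This avoids committing (and syncing) after every single insert, which is where most of the per-row overhead goes on a remote server.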
