Running Concurrent Shells

I have an app that makes multiple simultaneous calls to a shell (asynchronous, mode 1) to take advantage of multi-core machines. The number of shells will change from machine to machine, so the way I’m doing this is to let the user set a preference for the maximum number of processes (shells). I then keep count of them, incrementing and decrementing the counter as each thread is spawned and completed.

A Run button sends a list of files to be processed to a thread that delegates the work to other threads. It looks to see how many processes are running (by checking the counter) and, if there’s a free slot, it hands the next job to a new worker thread. Each worker thread opens an asynchronous shell and sends a command. When the results come back, the DataAvailable event is triggered, and its handler writes the results to the database.

When I test this with the preference set to 1 shell, the performance is the same as when I test it with 3 shells. My counters, which are incremented and decremented from inside the worker threads when they’re invoked, are working properly. But when I look at the Activity Monitor on my Mac, I only see one instance of the command being called, with only one open file at a time.

I thought that by using asynchronous shells, called from separate threads, I’d be able to run more than one command at a time, rather than sequentially. Is there something wrong with this setup?

You don’t actually need a thread for this.

Your description sounds fine though, and should work as you expect, so I suspect your code is not doing what you think it’s doing.

I want the threads so that I can update the GUI (progress bar, current file, total number of files processed, etc). One run can involve working with anywhere from a couple dozen to a couple hundred thousand files and data sets of many terabytes, so it can run for hours. To quote Robert Fripp: “Feedback is appreciated.”

The reason I’m using a thread for the thread controller, which does the delegating, is so that I can update the GUI. I suppose I could eliminate the sub-threads that it calls and just call the shells directly, but I did it separately so that they don’t lock up the delegator thread, because those have some while loops in them to pass the time while the external process is completing. The code looks like this:

threadOrganizer thread:

[code]//Make sure the processingDone flag is false
App.processingDone = False

//Some code to start a timer so we can track total processing time

//Start at the first record:
records.MoveFirst

while App.processingDone = false
  If App.currentProcessCount <= App.maxProcessCount then
    dim t as hashThread //this is a worker thread
    t = new hashThread

    //call the hashThread Constructor, pass it the file path and the db ID
    t.Constructor( trim( records.Field("mediaFilePath") ) ,  records.Field("ID") )
    t.Run

    if records.EOF then
      while App.currentProcessCount > 0
        //wait for spawned threads to complete
      wend
      App.processingDone = true
    else
      //move to the next record
      records.MoveNext
    end if

  end if
wend
[/code]

hashThread (a worker):

[code]//increment the process counter
App.currentProcessCount = App.currentProcessCount + 1

f = GetFolderItem( filePath, FolderItem.PathTypeNative )

if f <> nil then
  //update the UI so we know the current file
  App.currentFileProcessing = f.NativePath.ToText

  if f.Directory = false then
    sh = new ShellClassInstance
    sh.Mode = 1 //make it async

    Dim args as string
    Dim cmd as String
    Dim result as string

    //run some OS-Specific tests to generate the command and args

    sh.Constructor(cmd, args, dbID.val)

    while sh.IsRunning
      //twiddle thumbs
    wend

  end if

  //update the total count of completed hashes
  app.totalObjectsCompleted = App.totalObjectsCompleted + 1

  //update the current process count to remove this process
  app.currentProcessCount = app.currentProcessCount -1

  me.Kill
end if[/code]

ShellClassInstance:

[code]//set some local properties
command = cmd
arguments = args
dbID = id

//run the method that executes the command
me.ExecuteCommand[/code]

ShellClassInstance.DataAvailable:

[code]dim hash as string
If me.ErrorCode = 0 Then
  //some OS-specific code to parse the results into the string that's put into the db ('hash')
Else
  dim errcode as string = str(me.ErrorCode)
  break
End If

//update the db with the hash results
App.dbUpdateRow( dbID ,hash)[/code]

…and at this point we go back to hashThread.run, where the counter is decremented and that thread is killed. The next time threadOrganizer runs, it sees there’s an open slot and spawns another thread.
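The slot check described above can be sketched a little more tightly. This is only a sketch under two assumptions that are not in the code as posted: the comparison is a strict "<" so that no more than maxProcessCount workers are started, and the counter is incremented in threadOrganizer before the worker runs (in which case the increment at the top of hashThread would have to be removed). It also assumes hashThread's Constructor takes the path and ID directly, as the explicit Constructor call above suggests, and the 10 ms pause is an arbitrary value.

[code]//Sketch only: one pass of the threadOrganizer loop body
If App.currentProcessCount < App.maxProcessCount Then
  //claim the slot here, before the worker starts, so the next
  //pass of the loop already sees the updated count
  App.currentProcessCount = App.currentProcessCount + 1

  //pass the file path and db ID straight to the constructor
  Dim t As New hashThread( Trim( records.Field("mediaFilePath") ), records.Field("ID") )
  t.Run

  records.MoveNext
Else
  //no free slot: pause briefly instead of spinning
  Me.Sleep(10)
End If[/code]

The worker would still decrement the counter when it finishes, exactly as it does now.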

The while loop in hashThread.run is there because without it, I was getting only about 10% of the actual results with .DataAvailable, and crashes in random locations when I put the same code in .Completed and tested that instead. My hunch was that the parent thread was gone by then, leaving the shell hanging. Since the process I’m running can take anywhere from a second or two up to 20 minutes, I put the while… wend in there to wait so that the thread could finish cleanly.
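For illustration, the same wait can be written so that it pauses between checks rather than spinning. This is a minimal sketch, assuming the loop stays inside hashThread.Run (so Me is the worker thread) and that checking every 50 ms is acceptable; the interval is an arbitrary choice, not something taken from the code above.

[code]//Sketch only: wait for the async shell without a tight loop
While sh.IsRunning
  //Sleep gives other threads and the main event loop time to run
  //while the external process is still working
  Me.Sleep(50) //50 ms is an arbitrary poll interval
Wend[/code]

With jobs that can last up to 20 minutes, a short sleep like this should cost nothing noticeable in latency while leaving more CPU for the shells themselves.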

I’m not seeing what’s wrong here.

Something seems to be amiss. I was testing earlier from a compiled build, but when I went back to the IDE to run it in the debugger, with no changes to the code, I started seeing concurrent md5 instances in the Activity Monitor. No idea why.

You don’t need a thread to update the GUI with the number of files and progress. A Timer in multiple mode is all you need, with a 20-millisecond period. That is 1/50th of a second, close to the frame rate of a TV display; anything faster would be a waste of CPU.
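A rough sketch of what that Timer could look like, assuming a Timer on the window with Mode set to Timer.ModeMultiple and Period set to 20. The control names ProgressBar1 and CurrentFileLabel and the property App.totalObjects are hypothetical; App.totalObjectsCompleted and App.currentFileProcessing are the properties from the code posted earlier.

[code]//Action event of the progress Timer (Mode = Timer.ModeMultiple, Period = 20)
//ProgressBar1, CurrentFileLabel and App.totalObjects are hypothetical names

//total number of files in this run
ProgressBar1.Maximum = App.totalObjects

//number of hashes completed so far
ProgressBar1.Value = App.totalObjectsCompleted

//path of the file currently being processed
CurrentFileLabel.Text = App.currentFileProcessing[/code]

The Action event runs on the main thread, so it can touch the controls directly while the workers only update the App properties.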