@Thomas E You wrote that the md5 call uses only a few percent of the CPU. This points to an I/O bottleneck. Therefore it does not make much sense to perform several checksums at once.
I just ran your tests and confirmed it's taking longer with concurrent runs. So the question is: where is the bottleneck? My test files were 220MB WAV files (a folder of 5 of them, all the exact same size). They're on a local drive. Took about 20 seconds per file running concurrently.
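For anyone who wants to check where the time goes, here is a minimal Python sketch (not the code from my app) that hashes one file and reports wall-clock time next to CPU time. The function name `md5_with_timing` and the 1 MB chunk size are my own assumptions for illustration. If CPU time is a small fraction of wall time, the process is mostly waiting on reads, which would fit the I/O theory. Keep in mind that a second run on the same file may be served from the OS cache and look much faster.

```python
import hashlib
import time


def md5_with_timing(path, chunk_size=1024 * 1024):
    """Return (hex digest, wall seconds, CPU seconds) for one file.

    chunk_size is an arbitrary choice for this sketch, not a tuned value.
    """
    digest = hashlib.md5()
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # user + system CPU time of this process
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return digest.hexdigest(), wall, cpu
```

Calling `md5_with_timing("some_file.wav")` (a placeholder path) and comparing the last two numbers gives a rough idea of how much of the run was spent waiting rather than hashing.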
Incidentally, calling md5 via 'nice -20' actually makes them a bit slower.
So I moved them to our SAN, which is where we would be doing this work off of in real life. I was pretty surprised to see how much slower that is (36 seconds, vs 22 seconds off the local drive). The SAN is connected to this machine via a 10GbE network and is a true SAN (the drives appear as local volumes, no SMB). We regularly move files around on this machine at saturation on the 10GbE NIC. And this machine has one of the slower connections. Most of the workstations are connected to it via a 40GbE NIC, and the SAN can easily move 1.5GB/s.
I'll have to test this on one of the MacPro Xeon boxes that are connected to the SAN via 40GbE to try to eliminate that bottleneck. The connection I have on this iMac is good enough for the work I do from it, but it's nothing like the performance on those machines.
It is worth noting that the Python-based command line tool we've used in the past to do this same work lets you specify the number of concurrent processes. Past a certain number you start to see slowdowns, but in testing I've found that running 4-8 concurrent processes (depending on the machine you're on - I can do 8 on a Linux box with dual 14-core Xeons in it) is possible without slowdowns, and with a significant increase in overall processing speed. The only reason we're not using it is that it's buggy and sometimes refuses to run on certain volumes. Nobody can figure out why, but my software works fine on those volumes where the Python command line app can't.
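To make the "configurable number of concurrent jobs" idea concrete, here is a rough Python sketch. It is not the tool mentioned above, and the names `md5_of_file`, `md5_many` and the `workers` parameter are made up for this example. It uses a thread pool rather than separate processes, which I'm assuming is good enough here because `hashlib` releases the GIL while hashing large chunks. The right worker count depends on the storage and the machine, so treat the default of 4 as a starting point to test against, not a recommendation.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_many(paths, workers=4):
    """Hash several files with at most `workers` running at once.

    Returns a dict mapping each path to its hex digest.
    workers=1 gives a plain sequential run for comparison.
    """
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input paths.
        digests = list(pool.map(md5_of_file, paths))
    return dict(zip(paths, digests))
```

Timing `md5_many(files, workers=1)` against `workers=4` or `workers=8` on the same folder should show fairly quickly whether a given volume benefits from concurrency or just gets slower.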
*[I realized after posting this that I was running a large file copy in the background. I stopped that and revised my SAN speed numbers, and am now much less worried!]