Advice on fastest load and save of data

I’ve got an app that records and plays back MIDI data. I’m currently using an array of classes with properties as follows:

aSong class:
name, record time as strings
numEvents, playChannel as Uint32
3 booleans
an array of mEvent classes (may have thousands of elements)

mEvent class:
  mData as memoryBlock (rarely more than 2 or 3 bytes)
  mStamp as Uint32

I’ve optimized loading and saving as much as possible, building/parsing a memoryBlock in memory using pointers. Big files with thousands of songs can take 5-20 seconds to load and save. I’d love to be able to do this in a few seconds.

Access on the fly needs to be very fast so that events are recorded and played back on time. This works great with the array of classes I’m using, but file access takes a while when there are thousands of songs with 5-10 thousand events.

I’m wondering if I could load/save faster if using Dictionaries to store data. Would I be able to access the data arrays as quickly during recording and playback? Thanks for any suggestions.

Can you supply a sample project, and/or post some of your code here in the forum? We might be able to see some areas that are inefficient.

Here is the code to save a file: (some of the variables are declared elsewhere)

' create memoryblock of file to save
Var mstream As New MemoryBlock(300000000) '300MB of memory to build our file
mstream.LittleEndian = False
mstream.StringValue(0, 4) = FileMagic
mstream.Byte(4) = 6 ' file version

mstream.UInt32Value(offset) = num ' number of songs in this file
offset = 9

For i = 0 To numSongs -1 ' for each song
  mstream.StringValue(offset, 4) = SongMagic
  offset = offset + 4
  
  mstream.BooleanValue(offset) = mySongs(i).expanded
  offset = offset + 1
  
'make a Pstring
  mstream.StringValue(offset, mySongs(i).name.Length + 1) = chr(mySongs(i).name.Length) + mySongs(i).name
  offset = offset + 1 + mySongs(i).name.Length
  
  mstream.UInt32Value(offset) = mySongs(i).numEvents
  offset = offset + 4
  
  mstream.BooleanValue(offset) = mySongs(i).parent
  offset = offset + 1
  
  mstream.UInt32Value(offset) = mySongs(i).playChannel
  offset = offset + 4
  
  mstream.StringValue(offset, mySongs(i).time.Length + 1) = chr(mySongs(i).time.Length) + mySongs(i).time
  offset = offset + 1 + mySongs(i).time.Length
  
  ' now save array of mEvent class
  For n = 0 To UBound(mySongs(i).mEvents) 'for each MIDI event
    
    mstream.UInt16Value(offset) = mySongs(i).mEvents(n).mData.Size
    offset = offset + 2
    
    mstream.StringValue(offset, mySongs(i).mEvents(n).mData.Size) = mySongs(i).mEvents(n).mData
    offset = offset + mySongs(i).mEvents(n).mData.Size
    
    mstream.UInt32Value(offset) = mySongs(i).mEvents(n).mStamp
    offset = offset + 4
  Next n
  
  mySongs(i).isSaved = True
Next i
mstream.StringValue(offset, 4) = EOFMagic
offset = offset + 4

Dim stream As BinaryStream

Try
  stream = BinaryStream.Create(saveOut,True) ' overwrite file, handle this!!
Catch exc As IOException
  If exc.ErrorNumber = 104 Then
    MsgBox "This file is in use by another application.  You need to close it first."
  Else
    MsgBox "Error trying to create file."
  End
  Return False
End Try

stream.Write(mstream.LeftB(offset))
stream.Close

That code looks OK to me. The main thing is that you have loops within loops, and Xojo is slow in that case, as it’s spending a lot of time looking for thread yields.

Please add
#pragma BackgroundTasks False
to your code, and you should see an immediate 2x to 10x speedup.

You could also speed things up by using local variables to avoid repeated object dereferencing, e.g. change this:

' now save array of mEvent class
For n = 0 To UBound(mySongs(i).mEvents) 'for each MIDI event
  
  mstream.UInt16Value(offset) = mySongs(i).mEvents(n).mData.Size
  offset = offset + 2
  
  mstream.StringValue(offset, mySongs(i).mEvents(n).mData.Size) = mySongs(i).mEvents(n).mData
  offset = offset + mySongs(i).mEvents(n).mData.Size
  
  mstream.UInt32Value(offset) = mySongs(i).mEvents(n).mStamp
  offset = offset + 4
Next n

to this:

' now save array of mEvent class
Var eventArray() As mEvent = mySongs(i).mEvents
Var nEvents As Integer = UBound(eventArray)
For n = 0 To nEvents 'for each MIDI event
  Var mData As MemoryBlock = eventArray(n).mData
  mstream.UInt16Value(offset) = mData.Size
  offset = offset + 2
  mstream.StringValue(offset, mData.Size) = mData
  offset = offset + mData.Size
  
  mstream.UInt32Value(offset) = eventArray(n).mStamp
  offset = offset + 4
Next n

Should be faster, and also easier to read, in my opinion.

This may not produce a measurable improvement, but instead of copying the leftmost bytes to a string, try shortening the MemoryBlock:

 mStream.Size = offset
 stream.Write(mStream)

Other ideas:

  • running this in the IDE/debugger will be 5x to 20x slower than a compiled app. Be sure to test actual speed in a compiled app.
  • I can’t tell from your code, but are you repeatedly opening then closing a single stream? If you don’t need to do that, perhaps keep the stream open, do all the writing to it, then close it.
  • I agree with Andrew that
    stream.Write(mstream.LeftB(offset))
    could be slow. You might also try this instead:
    stream.Write(mstream.StringValue(0, offset))

I’m not sure how fast Xojo is to alter the size of a 300MB memoryBlock - that might be fast, or slow? You should benchmark Andrew’s suggestion too.
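
A quick way to settle this is to time both approaches on a block of the same size. The snippet below is only a sketch: the 300 MB allocation matches the save code above, but usedBytes is an arbitrary stand-in for the final value of offset, and timings taken in the IDE will differ from a built app.

```xojo
' Rough timing sketch: copy the used bytes to a String vs. shrinking the block in place.
' usedBytes is an assumed value standing in for the final offset.
Var mb As New MemoryBlock(300000000)
Var usedBytes As Integer = 50000000

Var t0 As Double = System.Microseconds
Var s As String = mb.StringValue(0, usedBytes) ' copies usedBytes bytes into a new String
Var t1 As Double = System.Microseconds
mb.Size = usedBytes ' resizes the existing block
Var t2 As Double = System.Microseconds

MessageBox "Copy: " + Str((t1 - t0) / 1000, "#") + " ms, resize: " + Str((t2 - t1) / 1000, "#") + " ms"
```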

One more idea:

  • you could use Pointers instead of Memoryblocks, so your code would look like this:
// create memoryblock of file to save
Var mstream As New MemoryBlock(300000000) '300MB of memory to build our file
mstream.LittleEndian = False
// get a Pointer to the memoryblock, for ultimate speed
// (note: Ptr accessors use the platform's native byte order and ignore mstream.LittleEndian)
var p as Ptr = mstream

p.CString(0) = FileMagic
p.Byte(4) = 6 ' file version
[... etc ...]

I don’t think this would gain you much, but it might be worth a try to squeeze out every last bit of performance.

Wow, thanks for all the advice. I was thinking I should avoid repeated object referencing, but hadn’t tried it.

I didn’t know about “BackgroundTasks False”. Also, I haven’t tested the compiled app since I did my other optimizing.

As for the idea of using Dictionaries instead of Classes, would that be a viable option?

I’ll try some of these ideas. Thanks again.

Check out the other #Pragmas on that page, these might also help (once you have your code fully debugged):

#Pragma BoundsChecking False
#Pragma NilObjectChecking False
#Pragma StackOverflowChecking False

In my experience, Dictionaries would likely be much slower than Memoryblocks.

Shortening a MemoryBlock is basically free.

As a long term plan, you might need to consider lazy-loading objects. It’d be a major change, but loading everything at once will always have a finite limit. You’re at 5-20 seconds for thousands of objects now. If you ever cross into tens or even hundreds of thousands, even a 50% reduction in loading time would still be a long time.

You might consider keeping the objects in a database table, but that comes with the logging files which may or may not be an issue for your app. You can do it with binary too by keeping track of the offsets you’ve already loaded. I’d consider the binary approach more complicated, but that would be your decision. But I think it’s something you’ll want to make long term plans for.

If you’re building as 64-bit, I also suggest that you try building using the Aggressive compiler setting. It’ll make the built binary larger, but has the potential to make operations like this much faster.

Thom, I like that idea. Once a song has been recorded, its data doesn’t change. I could easily convert that data to a file-ready memory block and then the saving of the blocks would be almost instantaneous. I hadn’t thought of that. The block could even be a property of the song and I could bypass the loop inside the loop. When loading a file, I could wait until I want to play the song to parse the events (or do it in idle time).

Or maybe I could stop using events as a class and keep them in a serial block with a progressive index, eliminating the loop inside the loop.
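
A rough sketch of the first idea, under some assumptions: each song carries a hypothetical fileBlock As MemoryBlock property that already holds its bytes exactly as they appear in the file (SongMagic through the last event), built once when recording finishes. The header layout and the names FileMagic, EOFMagic, saveOut, and numSongs are taken from the save code above; writing numSongs as the song count is a guess at what that code intends.

```xojo
' Sketch only: save becomes a header plus one Write per song.
' fileBlock is a hypothetical property; error handling is omitted here
' (the Try/Catch around BinaryStream.Create in the original still applies).
Var stream As BinaryStream = BinaryStream.Create(saveOut, True)
stream.LittleEndian = False ' match the big-endian MemoryBlock layout

stream.Write(FileMagic)
stream.WriteUInt8(6) ' file version
stream.WriteUInt32(numSongs) ' number of songs in this file

For i As Integer = 0 To numSongs - 1
  stream.Write(mySongs(i).fileBlock) ' pre-built bytes, no per-event loop
  mySongs(i).isSaved = True
Next

stream.Write(EOFMagic)
stream.Close
```

This drops the 300MB scratch MemoryBlock as well, since nothing has to be assembled in memory at save time.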

Thanks.

Greg, thanks for that suggestion. I didn’t know about that.

This is by far the best suggestion so far: do not parse data you will not use. Leave the MIDI events in storage until the user does something that requires them, and then parse them (be sure to retain the parsed results in memory in case the user comes back to that song!). Doing this will eliminate a huge proportion of the effort required.
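
As a rough illustration of parsing on demand, here is a sketch of a method that could live on the aSong class. It assumes two properties that are not in the original code: rawEvents (the song's event bytes, copied from the file unparsed at load time) and eventsParsed. The event layout (UInt16 size, data bytes, UInt32 stamp) follows the save code earlier in the thread.

```xojo
' Hypothetical method on the aSong class (sketch only).
' Assumed extra properties, not in the original code:
'   rawEvents As MemoryBlock  - this song's event bytes, unparsed
'   eventsParsed As Boolean   - True once mEvents has been filled
Sub EnsureEventsParsed()
  If eventsParsed Or rawEvents Is Nil Then Return
  
  #Pragma BackgroundTasks False
  rawEvents.LittleEndian = False ' the file format is big-endian
  
  Var count As Integer = numEvents ' copy to a signed Integer so count - 1 cannot wrap around
  ReDim mEvents(count - 1)
  Var offset As Integer = 0
  For n As Integer = 0 To count - 1
    Var size As Integer = rawEvents.UInt16Value(offset)
    offset = offset + 2
    
    Var ev As New mEvent
    ev.mData = rawEvents.MidB(offset, size)
    offset = offset + size
    ev.mStamp = rawEvents.UInt32Value(offset)
    offset = offset + 4
    
    mEvents(n) = ev
  Next
  
  eventsParsed = True ' keep the parsed events so a revisit costs nothing
End Sub
```

Calling it at the top of the playback routine means a song pays the parsing cost only the first time it is played.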

I tried BackgroundTasks False at the beginning of the routine and set it back to True at the end. Went from 27 to 26 seconds on a large file. Looks like only 4% improvement. Maybe I did something wrong.

Not necessarily. All that does is suspend the other threads within that particular method. If your method is already efficient and consuming 100% CPU, you won’t see much improvement.

The more relevant pragmas are going to be BoundsChecking and NilObjectChecking, since you are accessing an array.

That’s odd, it’s usually a much larger speedup, especially with nested loops.

  • what version of Xojo are you using? I seem to remember a bug long ago where BackgroundTasks False would fail, and you had to use
    #pragma DisableBackgroundTasks
  • are you testing in the IDE or in a built app?
  • There is no need to set BackgroundTasks to True at the end of a method (it’s always True at the start of every method).

For example, this code:

#Pragma BackgroundTasks False
dim t0 as double = system.Microseconds
for i as integer = 1 to 1e6
  dim y as integer = 42
  for j as integer = 1 to 20
    dim x as integer = y mod 37 * i * j
  next
next
dim dt as double = system.Microseconds - t0
MessageBox "That took " + str(dt/1000, "#") + " milliseconds"

In a built app, runs about 3.6x as fast with BackgroundTasks False (200msec vs 750msec)
In the IDE it’s about 10x slower, and using BackgroundTasks False is only 16% faster (3200 vs 3800msec)

His code is doing a LOT of work inside the loop, so that work is likely dominating the time used. Disabling background tasks is a pretty small slice of his particular pie. :pie:
