FolderItem.Length caching bug

Consider a non-empty file (in f), and this code of mine in a larger app:

dim content as String = BinaryStream.Open(f).Read(f.Length)

vs. this code:

dim bs as BinaryStream = BinaryStream.Open(f)
dim content as String = bs.Read(f.Length)

The first should work and did work in older Xojo and RealStudio versions.

But since Xojo 2019r2 the first returns an empty string. No exception, no error set in f.LastErrorCode. It just silently fails, which is terrible. The second still works.

I eventually narrowed this down and found what’s causing this:

  • The FolderItem object f is created at a time when the file exists but is empty, so its Length property is zero.
  • Then the file is written to with non-Xojo functions (in my case, by using CFWriteStream). This won’t automatically update f.Length.
  • Now I try to read from the file, but Xojo’s code appears to take a shortcut and, instead of trying to read from the file (which would return the data because the macOS file system knows that the file is not empty any more), it assumes that the file is still empty and returns an empty string.
  • I don’t understand, however, why unchaining the Open and Read calls avoids this issue.
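The steps above could be reproduced with something like the following. This is only a sketch under assumptions: the file name is made up, a Shell command stands in for the CFWriteStream write used in the original case, and whether the first Length query really primes a cache is exactly what is in question here.

```xojo
// Hypothetical repro sketch. File name and Shell-based write are assumptions.
Dim f As FolderItem = SpecialFolder.Temporary.Child("length-cache-test.txt")

// 1. Create the file empty and query Length once, so f may remember 0.
Dim out As BinaryStream = BinaryStream.Create(f, True)
out.Close
Dim lengthBefore As UInt64 = f.Length

// 2. Change the file outside of Xojo's file classes, without touching f.
Dim sh As New Shell
sh.Execute "/bin/echo hello > " + f.ShellPath

// 3. Chained form: the byte count comes from the (possibly stale) FolderItem.
Dim chained As String = BinaryStream.Open(f).Read(f.Length)

// 4. Stream-based form: the byte count comes from the open stream itself.
Dim bs As BinaryStream = BinaryStream.Open(f)
Dim viaStream As String = bs.Read(bs.Length)
bs.Close

System.DebugLog "before: " + Str(lengthBefore) + _
  ", chained: " + Str(LenB(chained)) + _
  ", via stream: " + Str(LenB(viaStream))
```

Comparing the three logged numbers across Xojo versions should show whether the chained form is the only one affected.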

This is quite bad, IMO. And in Xojo 2019r1.1 and all earlier versions this was not causing any trouble. So, someone at Xojo made the conscious decision to take this shortcut.

To work around the issue, I need to add a line that force-refreshes f, like this:

f = new FolderItem (f)

I’ve always used bs.length and never had an issue, but I do see there is an issue that should be reported.


Could it be that LLVM evaluates f.Length in the first case before calling Open(f), while in the second case it is evaluated a bit later, after Open(f) has run, and Open(f) may trigger the cache invalidation?

Are you sure the CFWriteStream function isn’t sort of asynchronous and causes the data to be written after you check the length?

In no way. That would be a Xojo bug if confirmed. If we write A = B()+C(), C() should never be evaluated before B(), because operations of the same precedence are evaluated left to right. It would be even worse for a chained operation like A = A().FuncB(), where the object returned by A() is needed before its FuncB() method can be called.

Not sure, but Xojo could erroneously be processing everything correctly and then releasing the returned string right away, without raising an error, so that the result is a nil/empty string, maybe?

Is this the case on all OSes? On which is this known to no longer work?

Like @Wayne_Golding, I use bs.Length.

Use bs.length. My guess is that it’s a function and so it only returns the length at the time you call it.

Typically, I do this:

Dim bs as BinaryStream

bs=BinaryStream.Open(WhichFile,True) 'See if there’s anything in the file
FileChunkStr=bs.read(bs.length,Nil) 'Read the whole file into a string

Yes. I also close the stream on time.
And the best proof is that the same code, on the same Mac, runs fine with older Xojo versions.

Yes, I reproduced this on both High Sierra and Monterey, on different Macs. The only factor that changed the behavior was the version of Xojo I built it with.

Just out of curiosity, does

f = New FolderItem(f) // Does it refresh the object touched externally?
dim content as String = BinaryStream.Open(f).Read(f.Length)

work correctly?

Yes, it does. I thought I had written that above.

And it’s not a timing issue - I had other code run in between, such as read other files, which made no difference. So it’s more likely a caching issue with the Length function. It should not cache the value at all but instead, whenever it’s called, fetch it again from the file system. And it probably did before the change in 2019r2.

Well, new folderItem(f) starts with no cached properties, so that, when calling Length, it’ll have to fetch it from the file system. But FolderItem has the tendency to cache the values once it has fetched them, for performance.
I had already advocated better control over caching many many years ago but that idea never caught on.

So, instead, one has to assume that whenever a file’s on-disk properties change, one needs to refresh a FolderItem by using the method f = New FolderItem(f), or one may keep getting out-of-date values. This has been like this even before 2019r2 (and is therefore predictable), but something else has changed that causes the weird (and unpredictable!) behavior in my initial post.

At this point, it’s basically a “good to know” thing. Xojo will probably not see this as a bug, so why bother. Just be aware that whenever you want to check if a file’s on-disk properties have changed, you need to create a new FolderItem in order to get those current values.

Agreed. And a few others do too. We could have a way to disable/enable such caching for this object, plus an extra FolderItem.Refresh() method to control it manually when caching is on.


If there is an internal method to invalidate the cache, we should know about it and call it, e.g. when a plugin or declare does something to the FolderItem. Does anyone know the trick?

It should be exposed as a .Refresh() method. I guess it deserves a Feature Request.

Rick, that feature request has been made more than once in the past 20 years, I’m sure. Good luck :)

Also, I may have been wrong about f = new FolderItem(f) being the work-around to refresh a folderitem object’s properties. I just did a Shell command that would change the contents of a dir:

  dim src as FolderItem = GetFolderItem("traversables")
  dim dst as FolderItem = GetFolderItem("traversables-test")
  dim sh as new Shell
  sh.Execute "/bin/rm -Rf "+dst.ShellPath
  sh.Execute "/bin/cp -Rp "+src.ShellPath+" "+dst.ShellPath

After this, dst.Exists was false and dst.Count was zero. So I added this line:

  dst = new FolderItem(f)
  if dst.Count = 0 then break

But that made no difference.

Only this did:

  dst = GetFolderItem("traversables-test")

This worries me.
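If re-resolving from the path is what made the difference above (an assumption, nothing in this thread confirms how FolderItem handles this internally), the step could be wrapped in a small helper. FreshItem is a made-up name, not part of Xojo; this is only a sketch.

```xojo
// Hypothetical helper (not part of Xojo): build a new FolderItem from the
// path string instead of copying an existing FolderItem object.
Function FreshItem(f As FolderItem) As FolderItem
  If f Is Nil Then Return Nil
  Return GetFolderItem(f.NativePath, FolderItem.PathTypeNative)
End Function

// Usage, continuing the example above:
dst = FreshItem(dst)
If dst <> Nil Then
  If dst.Count = 0 Then Break
End If
```

Unlike GetFolderItem("traversables-test"), this goes through the item's full native path, so it does not depend on the app's current default folder.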


You meant

  dst = new FolderItem(dst)  // Try to refresh
  if dst.Count = 0 then break

?

Yes, it was hopefully a typing error and not some nonsense I had in my code.
