Loading large files locks up main application window

Can someone help me understand why my application hangs (Windows shows “Not responding”) when I load a large text file (500 MB)? I have placed the read-in method for text files in its own thread, so that the main thread stays free for the user while the new file loads. However, while the text file is being read, the main thread becomes unresponsive. I notice that the main thread locks up as soon as the code below runs.

ts=file.openasTextFile
alltextfile=ts.ReadAll
ts.Close

Once the text file has been split into an array of smaller lines of text, the main thread becomes responsive again. Is there a cleaner way to read in this large amount of data? And why does the read-in thread seem to lock up the main thread? Any help with this issue would be appreciated. Thanks!

First, you should change your code to:

ts = TextInputStream.Open( file )

I haven’t tested but it’s entirely possible that the TextInputStream is not Thread-friendly. Remember, Xojo Threads are cooperative, not preemptive, so native functions have to grant time to allow other threads, including the main thread, to proceed.

But if you are going to Split the text anyway, why not process the file one line at a time in a loop in the first place? That will easily solve the problem. Using something like this…

ts = TextInputStream.Open( file )
while not ts.EOF
  dim thisLine as string = ts.ReadLine( theEncoding )
  // Do something with the line
wend
ts.Close

Damn, Kem. You beat me to it.

But, in the loop of the thread, the OP may want to include App.YieldToNextThread() to prevent the app from locking. That is if the text file is to be loaded in a thread.

Not needed. It will yield on the loop anyway.

Will it? I’ve always done something like this:

While condition
  // Do something
  App.YieldToNextThread()
Wend

Do I not need to?

Well, I never.

I just ran a quick test, and it turns out I do not need to use YieldToNextThread!!!

Learn something new every day!!

The difference is, without that call, Xojo will decide how often to yield based on thread priority and other factors to which we are not privy. With that call, it will yield on every cycle of the loop which may make the rest of your app more responsive but may not be particularly efficient.
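A middle ground is to yield only every so often. Here is a minimal, untested sketch of that idea, meant to run inside the Thread's Run event. The `kYieldEvery` value and the UTF-8 encoding are assumptions for illustration, and `file` is the FolderItem from the original question.

```xojo
// Sketch only: yield explicitly every kYieldEvery lines instead of every line.
// kYieldEvery is a hypothetical tuning value, not a recommended setting.
Const kYieldEvery = 1000

Dim lineCount As Integer
Dim ts As TextInputStream = TextInputStream.Open(file)
While Not ts.EOF
  Dim thisLine As String = ts.ReadLine(Encodings.UTF8) // encoding is an assumption
  // Do something with the line
  lineCount = lineCount + 1
  If lineCount Mod kYieldEvery = 0 Then
    App.YieldToNextThread
  End If
Wend
ts.Close
```

A larger `kYieldEvery` favors throughput; a smaller one favors UI responsiveness.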

I would propose that you read the data in chunks rather than lines. Reading line by line can be slow. You may need to experiment to find a chunk size that maintains speed whilst still allowing the OS to ‘check’ the application.

Dim binaryTextData As New MemoryBlock( file.Length )
Dim position As UInt64
Const chunkSize = 2097152 // 2 MB per read
ts = TextInputStream.Open( file )
While Not ts.EOF
  Dim chunk As String = ts.Read( chunkSize )
  // The last chunk may be shorter than chunkSize, so use its actual byte length
  binaryTextData.StringValue( position, chunk.LenB ) = chunk
  position = position + chunk.LenB
Wend
ts.Close
Return binaryTextData.StringValue( 0, binaryTextData.Size )

I haven’t tested this code.

Hi Sam,

That is a great idea, and it works too! Thank you so much for that reply; I can now read in these large text files. I had previously tried to read the whole file into a memory block in one shot (mb.stringValue(0,mb.size)=ts.ReadAll) and that did not work. I never thought to break up the memory block read-in into chunks.

The problem with the ReadLine approach suggested above (dim thisLine as string = ts.ReadLine( theEncoding )) is that even that line of code will not work with the text files that I am opening. For example, one is a 500 MB file in which some lines alone are over 250 MB. thisLine = ts.ReadLine will read in such a large line, but as soon as I try to parse that line with something like Split, the program crashes. As expected, there is a limit to how large a string can be within Xojo, and to what one can do with that string. I have run into that limit, and now I need to change how I read in these files. I even tried breaking these large lines in half, and then in half again, but it was a mess and Split would still crash the program.

Keeping the text file stored within a memory block allows me to parse the text file into much smaller strings, and eventually process the whole file.
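As a rough, untested sketch of what slicing the memory block could look like: the loop below walks the block and cuts a piece at every line feed, or after a maximum size if no line feed turns up. The names `mb`, `kLineFeed` and `kMaxPiece` and the choice of delimiter are assumptions for illustration, not the actual format of the files described here.

```xojo
// Sketch only. Assumptions: mb is the MemoryBlock holding the file, pieces
// end at a line feed (byte 10), and kMaxPiece is a hypothetical size cap.
Const kLineFeed = 10
Const kMaxPiece = 1048576 // 1 MB

Dim pieceCount As Integer
Dim start As Integer = 0
For i As Integer = 0 To mb.Size - 1
  Dim pieceLen As Integer = i - start + 1
  If mb.Byte(i) = kLineFeed Or pieceLen >= kMaxPiece Or i = mb.Size - 1 Then
    // Cutting at kMaxPiece can split a multi-byte character; handle that
    // separately if the data is not plain ASCII.
    Dim piece As String = mb.StringValue(start, pieceLen)
    // Process the piece here
    pieceCount = pieceCount + 1
    start = i + 1
  End If
Next
```

Each piece stays small enough to hand to Split or other string functions, and only one piece is held as a string at a time.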

Thanks for all replies. I tried them all and appreciate the effort.

Yes
As big as there is available memory in a 32 bit process

[quote=276669:@Norman Palardy]Yes
As big as there is available memory in a 32 bit process[/quote]
Is this still the case with a 64-Bit application?

Glad to have helped.

They really don't have limits on them that I know of
So I would expect that this is still true in a 64 bit app

There may be some underlying OS limit I don't know of though

Hum…

When I read a 500 MB file into a string, it takes 2.35 to 5.7 seconds. Absolutely no “not responding”.

Doing the same into a TextArea takes 51.8 seconds.

Clearly the issue is with loading the control's Text property.

You may want to update your TextArea with a timer, in smaller chunks. Even if it takes somewhat longer, it will free the UI.
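Something along these lines, as an untested sketch. It assumes a Timer named `Timer1` set to `Timer.ModeMultiple`, a TextArea named `TextArea1`, and two hypothetical window properties: `mPending As String` holding the loaded text and `mNextChar As Integer` starting at 1.

```xojo
// Timer1.Action event. All names and the chunk size are assumptions.
Sub Action()
  Const kCharsPerTick = 50000 // assumed chunk size; tune for your machine

  If mNextChar > mPending.Len Then
    Me.Mode = Timer.ModeOff // everything has been appended, stop the timer
    Return
  End If

  // Mid counts characters, so a multi-byte character is never cut in half
  TextArea1.AppendText(Mid(mPending, mNextChar, kCharsPerTick))
  mNextChar = mNextChar + kCharsPerTick
End Sub
```

Between ticks the main thread returns to the event loop, which is what keeps the window responsive.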

How have you done this? I need a solution with the new Xojo framework.

I need to read large text files (95.7 MB and 665.7 MB) and parse every line. I do it this way:

  1. read the text file with Xojo.IO.TextInputStream
  2. split it into single lines (Dim lines() As Text) // here the program freezes
  3. parse the lines into objects
  4. create a database record for every line that begins with “0”

The second way I tried was:

  1. read the text file with Xojo.IO.TextInputStream
  2. use RegEx to search for object records (indicated by lines that begin with “0”) // here the program freezes
  3. create a database record for every match

Both ways work as long as the files are not hundreds of MB.

This loads the entire content of the 217 MB Princess.txt file in about 2 seconds :

[code]
Dim en As String = EndOfLine
Dim eol As Text = en.ToText
Using Xojo.Core
Using Xojo.IO

Dim f As New FolderItem("/Users/mitch/Downloads/Princess.txt")

If Not f.Exists Then
  ' Cannot read the file if it does not exist
  Return
End If

Dim Lines() As Text
Dim errorText As Text
Try
  Dim input As TextInputStream
  input = TextInputStream.Open(f, TextEncoding.UTF8)
  Lines = input.ReadAll.Split(eol)
  input.Close
Catch e As IOException
  errorText = "File IO Error: " + e.Reason
End Try

System.DebugLog Lines.Ubound.ToText
[/code]
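If it is the Split (or a RegEx over the whole text) that freezes on the bigger files, reading one line at a time avoids ever holding all the lines at once. This is an untested sketch using the same path as the example above; `recordCount` is a hypothetical stand-in for the real parsing and database work.

```xojo
Using Xojo.Core
Using Xojo.IO

// Sketch only: process the file line by line with the new framework.
Dim f As New FolderItem("/Users/mitch/Downloads/Princess.txt")
Dim recordCount As Integer

Try
  Dim input As TextInputStream = TextInputStream.Open(f, TextEncoding.UTF8)
  While Not input.EOF
    Dim thisLine As Text = input.ReadLine
    If thisLine.BeginsWith("0") Then
      // Parse the line and create the database record here
      recordCount = recordCount + 1
    End If
  Wend
  input.Close
Catch e As IOException
  System.DebugLog("File IO Error: " + e.Reason)
End Try

System.DebugLog(recordCount.ToText)
```

Whether this is faster than ReadAll plus Split will depend on the file, but memory use stays close to the size of the longest line.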

@Michel Bujardet - Thanks, but that's the way I already do it. I think the problem is the RAM. I have a MacBook from 2009… and I don't want to exclude users with old hardware :wink:

I very much doubt it is the RAM. With a 700 MB file, the debug app grows up to 1.09 GB before going back down. That is still far from the 4 GB limit of a 32 bit app.