Hi,
For a parser, I need to read a text file (up to a few MB) character by character, appending each character to a string when certain conditions are satisfied.
At present I read the whole text file into a string and loop over it using Mid(i, 1). The code below (a standalone, simplified version of what I use) works, but it is extremely slow. As you can test for yourself using the attached input file, it takes ~30 s to process a 150 kB file (!).
How can I improve the speed? Ideas?
Thanks.
Test file: https://www.dropbox.com/s/ths009g6esr0b8c/testFILE.txt?dl=0
#pragma disableBackgroundTasks
Dim time As Double = Microseconds
Dim f As New FolderItem("testFILE.txt", FolderItem.PathTypeNative)
if f = Nil OR f.Exists = false then Return
Dim t As TextInputStream
t = TextInputStream.Open(f)
t.Encoding = Encodings.UTF8
Dim mInput As String = t.ReadAll
Dim rg as New RegEx
Dim myMatch as RegExMatch
rg.SearchPattern = "\s+"
rg.ReplacementPattern = " "
rg.Options.ReplaceAllMatches = True
mInput = rg.Replace(mInput)
Dim c As String
Dim entry As String
Dim found As Boolean = False
Dim paren As Integer = -1
Dim quotes As Integer = -1
for i As Integer = 1 to mInput.Len
  c = mInput.Mid(i, 1)
  if c = "@" then
    if not found then
      // here I initialize a new object
      found = true
      paren = -1
      quotes = -1
    end if
    entry = c
    continue
  end if
  if c = """" then
    if quotes = -1 OR quotes = 0 then
      quotes = 1
    elseif quotes = 1 then
      quotes = 0
    end if
    entry = entry + c
    continue
  end if
  if c = "{" then
    if paren = -1 then
      paren = 1
    else
      paren = paren + 1
    end if
    entry = entry + c
    continue
  end if
  if c = "}" then
    if paren = -1 then
      Return
    else
      paren = paren - 1
      if found AND paren = 0 AND (quotes = 0 OR quotes = -1) then
        found = false
        // here I add the object to an array
        entry = ""
        continue
      end if
    end if
    entry = entry + c
    continue
  end if
  if c = "," then
    if found AND paren = 1 AND (quotes = 0 OR quotes = -1) then
      // here I fill the object
      entry = ""
      continue
    end if
    entry = entry + c
    continue
  end if
  entry = entry + c
next
MsgBox Str((Microseconds-time )/1000000) + " seconds"
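One variant that might be worth timing against the loop above is sketched below. It is a guess at the bottleneck, not a measured result. As far as I understand, Mid on a UTF-8 string has to count characters from the start on every call, so calling it once per character gets slower as the file grows. The sketch splits the string into a character array once and collects the current entry in an array that is joined at the end, instead of using repeated `entry + c`. The function name `LastEntry` and the variables `chars`, `buffer` and `ch` are hypothetical and only there for illustration. It handles just the "@" case, not the quote and brace tracking.

```xojo
// Hypothetical helper, not part of the code above: returns the text from
// the last "@" to the end of source, using the split/join idea.
Function LastEntry(source As String) As String
  // Split with an empty delimiter yields one array element per character,
  // so the string is walked once instead of calling Mid(i, 1) each step.
  Dim chars() As String = Split(source, "")
  Dim buffer() As String // characters of the entry being collected

  For i As Integer = 0 To chars.Ubound
    Dim ch As String = chars(i)
    If ch = "@" Then
      ReDim buffer(-1) // a new entry starts here: discard what was collected
    End If
    buffer.Append(ch)
  Next

  // Build the entry string once, rather than growing it one character at a time.
  Return Join(buffer, "")
End Function
```

The same if/continue branches for quotes, braces and commas from the loop above would go where the "@" check is, with `entry = ""` becoming `ReDim buffer(-1)` and `entry = entry + c` becoming `buffer.Append(ch)`.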