I have a programme that parses third-party .txt files and loads the content into a database.
This all works fine, but every now and then it does not. A single file is not interpreted correctly.
A timer checks the download folder every 10,000 ms.
The .txt file is structured: each line has a marker that indicates the type of information that follows.
Before I work with the file, I save a copy to a special folder.
I need to read the entire text file before I can load the lines into the db.
I assemble the information with the help of a dictionary (59 values) and SQLite with 4 tables (up to 40 fields).
Generally there are only a few records for each text file, hardly ever more than 10.
After the push to the db, the loop ends.
The next loop starts by resetting the dictionary values to “” and deleting all records in SQLite.
Every now and then, say three or four times a week, a file is not properly interpreted. Some data is missing.
There is no pattern to the missing data.
There are no error messages.
Originally I suspected a memory leak.
But when I re-feed the file from the back-up into the running programme, it works properly.
Does anyone have a suggestion on where I should start looking for the bug,
or do I have to live with it?
You might want to figure out a way to determine whether the file has finished downloading. It could be that the file is only partially downloaded when the timer fires. Maybe remember the first time it appears, and then keep checking to see if the file's size has changed since the last check.
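To make the size-check idea concrete, here is a minimal sketch in Python (not the language of the original programme, just a neutral illustration). The helper name `is_stable` and the `seen` dictionary are made up for this example. It treats a file as ready only when its size is non-zero and equal to the size recorded on the previous timer tick; that a stable size means a finished download is an assumption, not a guarantee.

```python
import os

# Hypothetical helper: maps each path to the size seen on the previous timer tick.
_last_seen_sizes = {}

def is_stable(path, seen=_last_seen_sizes):
    """Return True only if the file is non-empty and its size is
    unchanged since the previous call for the same path."""
    size = os.path.getsize(path)
    previous = seen.get(path)
    seen[path] = size  # remember the current size for the next tick
    return previous is not None and previous == size and size > 0
```

With this approach a new file is always skipped on the tick where it first appears and is only picked up one tick later, so each file waits at least one extra timer interval before it is parsed.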
No, I do not. This parsing is only a transition of data. I store the text file and store the result in the real db.
But, you are correct. If the Timer (see above) does not fix it, I will have to do this - it will be a lot of work…
Nowhere did you mention Threads, so I’m assuming you’re not using them and all your code is running on the main thread. In that case, no, a Timer will not fire during the loop. A Timer always runs on the main thread and only when the main thread is otherwise idle.
You are correct, Kem.
There is only one thread looping until all the files that were found are processed.
The files are small, usually around 2kb, but contain a lot of diverse information.
Each line has a different meaning, so there is a lot of coding required.
As you and the language reference say, Timers run on the main thread.
That is why I had not thought about deactivating the thread.
Is there any chance that the files to be processed can change while the processor is running? i.e., can a file be getting written / updated / deleted by some external process independent of what your processor is doing? If so, you’ll need to find a way to guarantee that the file you are processing is complete and has not changed since your processor started.
Greg hinted at this above, but I’ll reinforce the idea with an example I’ve had to deal with: I have an app that can download a library of videos for the user to play in the app. If a video is only partially downloaded, then the MoviePlayer barfs and refuses to play the file. As a result, my app needs to first ask my server for an MD5 sum of the video file(s) it is about to download, then perform the download, and validate the file by checking the MD5 sum of the downloaded file against the one the server provided. If the download is finished but the MD5 sums don’t match, there was an issue with the transfer and the video file is deleted so my app can try again the next time the downloader timer fires.
I recognize that your files are tiny, but network issues or timing issues (when a file is being written by process A versus when it is read by process B) could be the culprit.
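For what it’s worth, the checksum comparison itself is only a few lines. Below is a rough sketch in Python rather than the language of my app. The function names are hypothetical, and it assumes the sender can supply the expected MD5 sum as a hex string, which may not be the case for third-party files.

```python
import hashlib

def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_intact(path, expected_md5):
    """Return True if the file's MD5 sum matches the one supplied by the sender.

    expected_md5 is assumed to be a hex string; whitespace and case are ignored.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

If `is_intact` returns False, the file would be left alone (or deleted) and retried on the next timer tick, as described above.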
[quote=448968:@Gerd Wilmer]Thanks, everyone. You reinforced my suspicion that it could be the Timer.
The first line in my loop now disables the Timer and the last line enables it.
Yep, that was bad. Take a look at the Timer's ModeSingle.
I delete the records in SQLite after they have been loaded to the “real” db.[/quote]
Not sure if it can cause the problem, but if the database is just temporary, it is better to delete the file than to delete its records, and much better to use an in-memory database.
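As an illustration of the in-memory idea, here is a small Python sketch using the standard `sqlite3` module (your own framework will have its own way to open an in-memory SQLite database). The `staging` table and its columns are invented for the example. A fresh connection per file means nothing can be left over from the previous file.

```python
import sqlite3

def new_staging_db():
    """Open a brand-new, empty in-memory SQLite database for one file."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical staging table; the real schema has 4 tables.
    conn.execute("CREATE TABLE staging (marker TEXT, value TEXT)")
    return conn

conn = new_staging_db()
conn.execute("INSERT INTO staging VALUES (?, ?)", ("A1", "example"))
count = conn.execute("SELECT COUNT(*) FROM staging").fetchone()[0]
conn.close()  # closing the connection discards the whole database
```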
To track down the bug, add some debug data to your database, such as how much data was read from the file and how many records were put in the database. That way you can narrow the error down to the file, the database, and so on.
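A rough sketch of that kind of bookkeeping, again in Python and with made-up names (`audit`, `parser_audit`), could look like this. It assumes you can count the lines read, the records assembled, and the records actually inserted for each file. Here the counts go to a log, but they could just as well be written to a table.

```python
import logging

log = logging.getLogger("parser_audit")  # hypothetical logger name

def audit(file_name, lines_read, records_built, records_inserted):
    """Log the counts for one file and return True if no records
    were lost between assembling them and inserting them."""
    consistent = records_built == records_inserted
    level = logging.INFO if consistent else logging.WARNING
    log.log(level, "%s: lines=%d built=%d inserted=%d",
            file_name, lines_read, records_built, records_inserted)
    return consistent
```

Comparing the logged line count with the line count of the backup copy of the same file should show whether the file was incomplete when it was read, or whether the data went missing later.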