Where do I start looking?

  1. 2 weeks ago

    I have a programme that parses third party .txt files and loads the content into the database.
    This all works fine, but every now and then it does not. A single file is not interpreted correctly.
    Workflow:
    A Timer checks the download folder every 10,000 ms
    The .txt file is structured: each line has a marker that indicates the type of information that follows
    Before I work with the file I save a copy to a special folder
    I need to read the entire text file, before I can load the lines into the db
    I assemble the information with the help of a dictionary (59 values) and SQLite with 4 tables (up to 40 fields)
    Generally there are only a few records for each text file, hardly ever more than 10
    After the push to the db, the loop ends
    The next loop starts with resetting the dictionary values to "" and deletes all records in SQLite

    Every now and then, say three or four times a week, a file is not properly interpreted. Some data is missing.
    There is no pattern to the missing data.
    There are no error messages.
    Originally I suspected a memory leak.
    But when I re-feed the file from the back-up into the running programme, it works properly.

    Does anyone have a suggestion on where I should start looking for the bug,
    or do I have to live with it?

  2. Ivan T

    Aug 4 Pre-Release Testers
    Edited 2 weeks ago

    So, you populate a database, and then you delete those records?

  3. Greg O

    Aug 4 Xojo Inc

    You might want to figure out a way to determine if the file has finished downloading. It could be that the file is only partially downloaded when the timer fires. Maybe remember the first time it appears, and then keep checking to see if the file’s size has changed since the last check.
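
    A minimal sketch of that size check, written in Python rather than Xojo purely for illustration. The helper names `snapshot_sizes` and `stable_files` and the `order.txt` file are hypothetical; the idea is only that a file counts as ready once its size is the same on two consecutive timer ticks.

```python
import os
import tempfile


def snapshot_sizes(folder):
    """Return {filename: size_in_bytes} for every .txt file in folder."""
    sizes = {}
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        if name.lower().endswith(".txt") and os.path.isfile(path):
            sizes[name] = os.path.getsize(path)
    return sizes


def stable_files(previous, current):
    """Names that were seen on the previous tick with the same size as now."""
    return sorted(name for name, size in current.items()
                  if previous.get(name) == size)


# Demo: a temporary folder stands in for the download folder.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "order.txt")
    with open(path, "w") as f:
        f.write("H|header\n")
    tick1 = snapshot_sizes(folder)            # first sighting, nothing to compare
    ready_after_tick1 = stable_files({}, tick1)

    with open(path, "a") as f:
        f.write("D|detail\n")                 # the file is still growing
    tick2 = snapshot_sizes(folder)
    ready_after_tick2 = stable_files(tick1, tick2)

    tick3 = snapshot_sizes(folder)            # unchanged since tick2
    ready_after_tick3 = stable_files(tick2, tick3)
```

    With a 10-second timer this delays each file by one extra tick, which is probably acceptable for files of a few kilobytes.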

  4. Ivan,
    I delete the records in SQLite after they have been loaded to the "real" db

  5. Greg,
    the timer starts a loop
    Does the timer fire even if the loop is not finished?
    If yes, can I pause the timer while the loop is running?

  6. Kevin G

    Aug 4 Pre-Release Testers, Xojo Pro Gatesheed, England

    It could be that you have detected and read a file at the same time it is being created.
    When you detect a new file you could also make sure it is older than 1 minute before you try to process it.
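
    A rough illustration of the age check, again in Python rather than Xojo. The function name `is_old_enough` and the 60-second threshold are assumptions taken from the "older than 1 minute" suggestion; the `now` parameter exists only so the demo does not have to wait.

```python
import os
import tempfile
import time

MIN_AGE_SECONDS = 60  # assumed threshold ("older than 1 minute")


def is_old_enough(path, now=None, min_age=MIN_AGE_SECONDS):
    """True if the file was last modified at least min_age seconds ago."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path) >= min_age


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "fresh.txt")
    with open(path, "w") as f:
        f.write("H|header\n")
    modified = os.path.getmtime(path)
    fresh = is_old_enough(path, now=modified + 5)      # 5 s old: skip this tick
    settled = is_old_enough(path, now=modified + 90)   # 90 s old: safe to process
```

    Using the modification time rather than the creation time means a file that is still being appended to keeps getting pushed back.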

  7. Alexander v

    Aug 5 Europe (Houten, The Netherland...

    Do you keep a log file about what file is opened, which line is read, the parsing methods etc?

  8. Thanks everyone. You reinforced my suspicion that it could be the Timer.
    The first line in my loop now disables the Timer and the last line enables it.
    Fingers crossed...

  9. @Alexander v der Linden Do you keep a log file about what file is opened, which line is read, the parsing methods etc?

    Hi Alexander
    No, I do not. This parsing is only a transfer of data. I store the text file and store the result in the real db.
    But you are correct. If the Timer change (see above) does not fix it, I will have to do this. It will be a lot of work...
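
    The logging need not be elaborate. Below is a small sketch in Python (not Xojo) of the kind of log Alexander describes: which file was opened, which line was read, and which lines were skipped. The `MARKER|value` line format, the marker set and the function name `parse_lines` are all hypothetical stand-ins for the real parser.

```python
import io
import logging

log_stream = io.StringIO()             # a real program would log to a file
logger = logging.getLogger("parser_sketch")
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)

KNOWN_MARKERS = {"H", "D"}             # hypothetical marker set


def parse_lines(file_name, lines):
    """Parse 'MARKER|value' lines, logging every line read and every skip."""
    records = []
    logger.info("open %s (%d lines)", file_name, len(lines))
    for number, line in enumerate(lines, start=1):
        marker, _, value = line.partition("|")
        if marker in KNOWN_MARKERS:
            logger.debug("%s:%d marker=%s", file_name, number, marker)
            records.append((marker, value))
        else:
            logger.warning("%s:%d unknown marker %r, line skipped",
                           file_name, number, marker)
    logger.info("done %s: %d of %d lines parsed",
                file_name, len(records), len(lines))
    return records


records = parse_lines("order.txt", ["H|header", "X|???", "D|detail"])
log_text = log_stream.getvalue()
```

    The closing "parsed N of M lines" entry is the useful part: when data goes missing, it shows at a glance whether the lines were never read or were read and then dropped.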

  10. Kem T

    Aug 5 Pre-Release Testers, Xojo Pro, XDC Speakers Connecticut

    Nowhere did you mention Threads, so I'm assuming you're not using them and all your code is running on the main thread. In that case, no, a Timer will not fire during the loop. A Timer always runs on the main thread and only when the main thread is otherwise idle.

  11. @Kem T Nowhere did you mention Threads, so I'm assuming you're not using them and all your code is running on the main thread. In that case, no, a Timer will not fire during the loop. A Timer always runs on the main thread and only when the main thread is otherwise idle.

    You are correct, Kem
    There is only one thread looping until all the files that were found are processed.
    The files are small, usually around 2kb, but contain a lot of diverse information.
    Each line has a different meaning, so there is a lot of coding required.
    As you, and the language reference, say Timers run on the main thread.
    That is why I had not thought about deactivating the thread.

  12. Kimball L

    Aug 5 Pre-Release Testers, Xojo Pro Meridian, ID, USA

    Is there *any* chance that the files to be processed can change *while* the processor is running? i.e., can a file be getting written / updated / deleted by some external process independent of what your processor is doing? If so, you'll need to find a way to guarantee that the file you are processing is complete and has not changed since your processor started.

    Greg hinted at this above, but I'll reinforce the idea with an example I've had to deal with: I have an app that can download a library of videos for the user to play in the app. If a video is only partially downloaded, then the MoviePlayer barfs and refuses to play the file. As a result, my app needs to first ask my server for an MD5 sum of the video file(s) it is about to download, then perform the download, and validate the file by checking the md5 sum of the downloaded file against the one the server provided. If the download is finished but the md5 sums don't match, there was an issue with the transfer and the video file is deleted so my app can try again the next time the downloader timer fires.

    I recognize that your files are tiny, but network issues or temporal issues (timing of when a file is being written by process a vs when it is read by process b) could be the culprit.
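
    The checksum comparison described above might look roughly like this in Python (not Xojo). It assumes the sender publishes an MD5 sum alongside each file, which the thread does not confirm for the third-party .txt files; `md5_of_file` and `is_complete` are hypothetical helper names.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_complete(path, expected_md5):
    """True if the file on disk matches the checksum the sender provided."""
    return md5_of_file(path) == expected_md5.lower()


content = b"H|header\nD|detail\n"
expected = hashlib.md5(content).hexdigest()   # stands in for the sender's checksum

with tempfile.TemporaryDirectory() as folder:
    full = os.path.join(folder, "full.txt")
    partial = os.path.join(folder, "partial.txt")
    with open(full, "wb") as f:
        f.write(content)
    with open(partial, "wb") as f:
        f.write(content[:9])                  # simulate a truncated transfer
    full_ok = is_complete(full, expected)
    partial_ok = is_complete(partial, expected)
```

    If the sender offers no checksum, the size or age checks suggested earlier in the thread are the fallback.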

  13. Ivan T

    Aug 5 Pre-Release Testers

    @Gerd W Thanks everyone. You reinforced my suspicion that it could be the Timer.
    The first line in my loop now disables the Timer and the last line enables it.
    Fingers crossed...

    Yep, that was bad. Take a look at the Timer's ModeSingle setting.

    @Gerd W Ivan,
    I delete the records in SQLite after they have been loaded to the "real" db

    Not sure if it can cause the problem, but if the database is just temporary, it is better to delete the file, and a lot better to use an in-memory database.

    To track down the bug, add some debug data to your database, such as how much data was read from the file and how many records were put in the database. That way you can narrow the error down to the file, the database, and so on.

    Do you use transactions for the inserts?
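
    To make those three suggestions concrete, here is a sketch in Python's `sqlite3` module rather than Xojo: an in-memory staging database, all inserts for one file wrapped in a single transaction, and a read-versus-stored count to compare afterwards. The `staging` table, its columns and the function name `load_records` are hypothetical.

```python
import sqlite3


def load_records(records):
    """Insert all records in one transaction; return (read_count, stored_count)."""
    conn = sqlite3.connect(":memory:")        # temporary database, no file on disk
    conn.execute(
        "CREATE TABLE staging (marker TEXT NOT NULL, value TEXT NOT NULL)")
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(
                "INSERT INTO staging (marker, value) VALUES (?, ?)", records)
    except sqlite3.Error:
        pass        # a real program would log the error here
    stored = conn.execute("SELECT COUNT(*) FROM staging").fetchone()[0]
    conn.close()
    return len(records), stored


good = load_records([("H", "header"), ("D", "detail")])
# The second row violates NOT NULL, so the whole batch is rolled back.
bad = load_records([("H", "header"), ("D", None)])
```

    A mismatch between the two counts points at the insert step; matching counts with missing data point back at the file or the parser.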
