TextOutputStream saves sequence of NULL bytes

This question concerns the following code:

[code]dim f as folderItem = get_folder_item_of_preferences_file
dim t as textOutputStream = f.CreateTextFile
if t=nil then return

t.write get_preferences_string //get_preferences_string returns a string of characters, which is guaranteed to not contain NULL bytes.
t.close[/code]

Problem: on an occasional user’s machine, the resulting file contains several thousand NULL bytes.

It has not yet been possible to reproduce this issue in-house at all, but it is observed via error reports from some customers.

At a guess, either CreateTextFile, or TextOutputStream.Write, or TextOutputStream.Close is crashing the process after space has been allocated on disc for the file, but before the correct data was written.

Question:
How does one create conditions under which this anomaly can be reproduced?

I see no IOException handling in your code. Maybe it’s better not to guess when or why it’s failing, but to catch the issue when it happens and act accordingly? :slight_smile:

http://documentation.xojo.com/index.php/IOException
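A minimal sketch of what that exception handling could look like with the classic framework. It assumes the stream is opened with TextOutputStream.Create, which raises an IOException on failure, rather than with CreateTextFile, which returns Nil. Whether a failed Write also raises an IOException may depend on the Xojo version, so treat that as an assumption. The two get_... functions are the ones from the original post.

```xojo
Dim f As FolderItem = get_folder_item_of_preferences_file
If f = Nil Then Return

Try
  // Create raises an IOException if the file cannot be created or opened.
  Dim t As TextOutputStream = TextOutputStream.Create(f)
  t.Write(get_preferences_string)
  t.Close
Catch e As IOException
  // Log the failure so it shows up in customer error reports.
  System.DebugLog("Preferences write failed, error " + Str(e.ErrorNumber))
End Try
```

This does not explain the NULL bytes by itself, but it narrows down whether the write path reported a failure on the affected machines.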

How about an encoding issue? I see no defined encoding here. If the user’s machine uses UTF-16, that would explain the NULL bytes…
Set UTF8 encoding for the text stream.

Why do you think you have NUL characters in your text file?

Are you trying to store ASCII characters only?

Did you try:

t.Encoding = Encodings.UTF8 // strings are UTF8

Yes, it would certainly be good practice to check for exceptions here, and I may make this change in the longer term. In the short term I’d love to understand how the behaviour reported by customers can arise.

[quote]How about an encoding issue? I see no defined encoding here. If the user’s machine uses UTF-16, that would explain the NULL bytes…
Set UTF8 encoding for the text stream.[/quote]

Good idea. However, the resulting file is literally a sequence of approx 17000 null bytes (and no other data), and so does not appear to be a UTF-16 encoded string.
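To tell the two cases apart on a customer's file, a check along these lines could help. It is only a sketch: FileIsAllNulBytes is a hypothetical helper name, and it reads the whole file into memory, which is fine for a preferences file of this size. ASCII text stored as UTF-16 has a zero byte for every character but also the non-zero character bytes, so the function returns False for such a file and True only when every byte is zero.

```xojo
// Hypothetical helper: True only if the file is non-empty and every byte is 0.
Function FileIsAllNulBytes(f As FolderItem) As Boolean
  // Open read-only; BinaryStream.Open raises an IOException on failure.
  Dim b As BinaryStream = BinaryStream.Open(f, False)
  Dim data As MemoryBlock = b.Read(b.Length)
  b.Close

  If data.Size = 0 Then Return False

  For i As Integer = 0 To data.Size - 1
    If data.Byte(i) <> 0 Then Return False
  Next
  Return True
End Function
```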

[quote]Did you try:

t.Encoding = Encodings.UTF8 // strings are UTF8[/quote]

Unfortunately this is a ‘classic’ TextOutputStream, not a Xojo.IO.TextOutputStream, which, unless I am mistaken, does not have an Encoding method or property.

There are two possible things I can think of… and one may not apply to modern file systems.

  • Disk blocking factor: a file can only consist of a MINIMUM of “x” blocks, regardless of its contents.
  • Use WriteLine instead of Write.

If the file is (for example) a minimum of 4k and you write 100 characters, the file is padded with 3996 nulls.
If ReadLine fails to find the end of line, it might read the entire block up to the end of file.
Perhaps using WriteLine will help.

t.Write(ConvertEncoding(get_preferences_string, Encodings.UTF8)) :wink:

I bet he writes out UTF16 or UTF32, which by definition contain null bytes for ASCII text when read as UTF-8…

Thanks for the responses.

I believe the answer may be that if power to the PC is cut during a call to TextOutputStream.Write, NULL bytes are the result when the PC is powered on again.

On macOS I have experienced files starting with lots of NUL bytes, which are produced if the file is deleted or truncated by an outside process while a TextOutputStream is still open for it. The next write then happens at the seek position just after the last write, which leaves lots of NUL bytes before it if the file was truncated in between. There is no exception or error to detect this: <https://xojo.com/issue/43464>
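A possible way to try reproducing that truncation scenario in-house, as an untested sketch. It assumes macOS (or another system with a POSIX shell), uses a Shell command to stand in for the "outside process", and the file name nul_repro.txt is made up for illustration. Whether the result matches the customer reports will depend on how the stream buffers its writes.

```xojo
Dim f As FolderItem = SpecialFolder.Temporary.Child("nul_repro.txt")

Dim t As TextOutputStream = TextOutputStream.Create(f)
t.Write("first block of preference data")
t.Flush // push the first block to disk so the file position moves forward

// Stand-in for an outside process: truncate the file to zero length
// while the stream is still open.
Dim sh As New Shell
sh.Execute(": > " + f.ShellPath)

// This write should land at the old position, past the new end of file.
t.Write("second block")
t.Close
```

If the behaviour described above holds, the file should now begin with a run of NUL bytes as long as the first block, followed by "second block". Inspect it with a hex viewer rather than a text editor to confirm.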