Create a UTF-8 file with BOM

And what about opening a file, reading its content, and writing that content to a new file that must have the same encoding, BOM, etc. as the source file?

If this is something you need to do consistently, consider creating a class that handles it for you. If you do it right, it will be invisible to you.

Yes, I need to do that all the time. I work with lots of files and also create lots of new ones. Until now I did that in Delphi, where it is very easy, but Xojo is giving me a headache.

Every language has its advantages and disadvantages over others.

BTW, a BOM for UTF-8 files is not required or recommended. See:

Page 36:

The main purpose of the BOM, or Byte Order Mark, is to indicate the byte order in encodings where it might be different, e.g., UTF-16LE vs. UTF-16BE. UTF-8 byte order never changes.

The UTF-8 BOM can also indicate that the data IS UTF-8, although that is NOT its primary use.
Reading a UTF-8 text file in an encoding like ISO Latin 1 is entertaining and confusing.
BBEdit likes to do stuff like this.
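
To make that kind of mix-up concrete, here is a minimal sketch. It assumes only the standard ConvertEncoding and DefineEncoding functions; the variable names are made up for illustration.

```xojo
// "é" is stored as two bytes in UTF-8: C3 A9
Dim utf8 As String = ConvertEncoding("é", Encodings.UTF8)

// Relabel the same bytes as ISO Latin 1 without converting them.
Dim wrong As String = DefineEncoding(utf8, Encodings.ISOLatin1)

// wrong now displays as two characters, Ã©, because each byte
// is treated as its own Latin 1 character.
```

A BOM at the start of the file is one way for a reader to avoid this guess.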

[quote=441900:@Emile Schwarz]What happened when you saved a text from a TextArea (with a UTF-8 encoding) ?

Did you try that ?[/quote]
and read it back in a different TextArea?

Try it and report the result, please.

I am not reading or writing text from/to text areas. I read text from files into a string array, process the data line by line, and write it back. The encoding could be ANSI, UTF-8 (BOM EF BB BF), UTF-16BE (BOM FE FF), UTF-16LE (BOM FF FE), or something else, whatever the user selects for the export, so I need to be able to write a BOM.

[quote=441856:@Christian Schmitz]I think you can just do a write as the first thing after creating file:

t.Write encodings.UTF8.Chr(&hFEFF)[/quote]

This should work.
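
For context, a minimal sketch of how that one-liner might sit in a complete write. The file name and the sample line are placeholders, and it assumes the rest of the text written is already UTF-8.

```xojo
// Hypothetical target file, for illustration only.
Dim f As FolderItem = SpecialFolder.Desktop.Child("with-bom.txt")
Dim t As TextOutputStream = TextOutputStream.Create(f)

// U+FEFF encoded as UTF-8 is the three bytes EF BB BF.
t.Write Encodings.UTF8.Chr(&hFEFF)

t.WriteLine "first line"
t.Close
```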

The following might help:

  1. Writing Files
    • Make sure your content is in the correct encoding. If not, perform a ConvertEncoding on it.
    • Write as a binary file rather than a text file.
    • Write one of the following before your content:
    UTF8: ChrB(&hEF) + ChrB(&hBB) + ChrB(&hBF)
    UTF16BE: ChrB(&hFE) + ChrB(&hFF)
    UTF16LE: ChrB(&hFF) + ChrB(&hFE)
    UTF32BE: ChrB(&h00) + ChrB(&h00) + ChrB(&hFE) + ChrB(&hFF)
    UTF32LE: ChrB(&hFF) + ChrB(&hFE) + ChrB(&h00) + ChrB(&h00)
    • Write your content
    • Close
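
The writing steps above could be sketched roughly as follows. WriteWithBOM is a hypothetical helper name, not a framework method. The sketch covers only UTF-8 and UTF-16, and it assumes the content string already has a defined encoding so that ConvertEncoding can do its job.

```xojo
// Sketch only: WriteWithBOM is a made-up name, not part of Xojo.
Sub WriteWithBOM(f As FolderItem, content As String, enc As TextEncoding)
  // Pick the BOM bytes for the target encoding ("" means: write no BOM).
  Dim bom As String
  If enc.Equals(Encodings.UTF8) Then
    bom = ChrB(&hEF) + ChrB(&hBB) + ChrB(&hBF)
  ElseIf enc.Equals(Encodings.UTF16BE) Then
    bom = ChrB(&hFE) + ChrB(&hFF)
  ElseIf enc.Equals(Encodings.UTF16LE) Then
    bom = ChrB(&hFF) + ChrB(&hFE)
  End If

  // Use a binary stream so the bytes are written exactly as given.
  Dim bs As BinaryStream = BinaryStream.Create(f, True)
  bs.Write(bom)
  bs.Write(ConvertEncoding(content, enc))
  bs.Close
End Sub
```

Any other encoding, such as ANSI, falls through with an empty BOM, so the content is written without one.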

  2. Reading Files
    • Use a binary stream to read the first 4 bytes.
    • Perform a comparison on the bytes to determine if they represent a BOM. This will tell you the Xojo text encoding to use and the offset to the start of the content.
    • Open the folderitem as a text stream
    • Set the encoding property on the text stream
    • Set the PositionB property to the offset
    • Read the content
    • Close
    • If necessary, convert the encoding of the data you have read back to your working encoding (e.g. UTF-8).
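
The detection step above might look something like this. DetectBOM is a hypothetical helper name. The sketch handles only UTF-8 and UTF-16; a UTF-32LE BOM (FF FE 00 00) starts with the same two bytes as UTF-16LE, so it would have to be tested first if you need it.

```xojo
// Sketch only: DetectBOM is a made-up name, not part of Xojo.
// Returns the encoding indicated by the BOM (Nil if none was found)
// and sets bomLength to the number of bytes to skip.
Function DetectBOM(f As FolderItem, ByRef bomLength As Integer) As TextEncoding
  Dim bs As BinaryStream = BinaryStream.Open(f, False)
  Dim head As String = bs.Read(4) // may be shorter for tiny files
  bs.Close

  bomLength = 0

  // StrComp with mode 0 compares the raw bytes.
  If StrComp(LeftB(head, 3), ChrB(&hEF) + ChrB(&hBB) + ChrB(&hBF), 0) = 0 Then
    bomLength = 3
    Return Encodings.UTF8
  ElseIf StrComp(LeftB(head, 2), ChrB(&hFE) + ChrB(&hFF), 0) = 0 Then
    bomLength = 2
    Return Encodings.UTF16BE
  ElseIf StrComp(LeftB(head, 2), ChrB(&hFF) + ChrB(&hFE), 0) = 0 Then
    bomLength = 2
    Return Encodings.UTF16LE
  End If

  Return Nil // no BOM: the caller has to pick a default encoding
End Function
```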

[quote=442216:@Kevin Gale]The following might help:

  1. Writing Files
    • Make sure your content is in the correct encoding. If not, perform a ConvertEncoding on it.
    • Write as a binary file rather than a text file.
    • Write one of the following before your content:
    UTF8: ChrB(&hEF) + ChrB(&hBB) + ChrB(&hBF)
    UTF16BE: ChrB(&hFE) + ChrB(&hFF)
    UTF16LE: ChrB(&hFF) + ChrB(&hFE)
    UTF32BE: ChrB(&h00) + ChrB(&h00) + ChrB(&hFE) + ChrB(&hFF)
    UTF32LE: ChrB(&hFF) + ChrB(&hFE) + ChrB(&h00) + ChrB(&h00)
    • Write your content
    • Close

  2. Reading Files
    • Use a binary stream to read the first 4 bytes.
    • Perform a comparison on the bytes to determine if they represent a BOM. This will tell you the Xojo text encoding to use and the offset to the start of the content.
    • Open the folderitem as a text stream
    [/quote]

Why bother with that? You can use the Binary Stream to read in Text with an encoding:

Dim theContent as String = BS.Read(theFolderItem.Length-4, theEncoding)

[quote=442266:@Karen Atkocius]Why bother with that? You can use the Binary Stream to read in Text with an encoding:

Dim theContent as String = BS.Read(theFolderItem.Length-4, theEncoding) [/quote]
I must have missed that you could read with an encoding when using a binary stream.
However, you would have to set the read position first: if the BOM was UTF-8, you would have to rewind 1 byte before reading the content.

UTF-8 BOM is three bytes.

[quote=442285:@Kevin Gale]I must have missed that you could read with an encoding when using a binary stream.
However, you would have to set the read position first: if the BOM was UTF-8, you would have to rewind 1 byte before reading the content.[/quote]

Not a problem.

BS.Position = BS.Position -1
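
Since a UTF-16 BOM is only two bytes, rewinding by one is specific to UTF-8. A slightly more general sketch sets the position from the BOM length instead. ReadSkippingBOM is a hypothetical helper name, and it assumes the caller has already worked out the encoding and the BOM length.

```xojo
// Sketch only: ReadSkippingBOM is a made-up name, not part of Xojo.
// bomLength is 3 for UTF-8, 2 for UTF-16, 0 if the file has no BOM.
Function ReadSkippingBOM(f As FolderItem, enc As TextEncoding, bomLength As Integer) As String
  Dim bs As BinaryStream = BinaryStream.Open(f, False)

  // Jump straight past the BOM instead of rewinding after a 4-byte read.
  bs.Position = bomLength

  // Read everything that is left and tag it with the detected encoding.
  Dim content As String = bs.Read(bs.Length - bomLength, enc)
  bs.Close
  Return content
End Function
```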

Sorry for being late. Here is a link to an article from 2017 on BOM with an example program.

How to Implement a Byte Order Marker BOM with Xojo

Maybe this will help?