[quote=441862:@Christian Schmitz]You need to write the BOM if you want it.
Then for all other write calls, always use ConvertEncoding to make sure it’s a UTF-8 string.
Just writing some string may not give the right encoding.[/quote]
Isn’t the UTF-8 BOM 0xEF 0xBB 0xBF and not 0xFE 0xFF?
Well, it’s &hFEFF for the magic character. Depending on the encoding, it’ll be FE FF for UTF-16 and EF BB BF for UTF-8.
But you don’t need to know those details of the byte representations.
Try it:
Dim s As String = encodings.UTF8.Chr(&hFEFF)
MsgBox EncodeHex(s)
Protected Function BOMUTF8() As String
  // If you define it as the encoding, you can't properly add it to a string
  Static r As String = DefineEncoding( ChrB( &hEF ) + ChrB( &hBB ) + ChrB( &hBF ), Nil )
  Return r
End Function
[quote=441870:@Christian Schmitz]Well, it’s &hFEFF for the magic character. Depending on the encoding, it’ll be FE FF for UTF-16 and EF BB BF for UTF-8.
But you don’t need to know those details of the byte representations.
Try it:
Dim s As String = encodings.UTF8.Chr(&hFEFF)
MsgBox EncodeHex(s)[/quote]
Except that for UTF-16 you get the same BOM for UTF16, UTF16LE and UTF16BE, which is wrong.
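If the byte order matters, one way around this is to build the UTF-16 BOM from raw bytes, the same way BOMUTF8 does for UTF-8. This is only a rough sketch: BOMUTF16 and its bigEndian parameter are names made up for illustration, not part of the framework. The byte values themselves are the standard ones: FE FF for big-endian, FF FE for little-endian.

```xojo
// Hypothetical helper, modeled on BOMUTF8: returns the UTF-16 BOM as raw bytes.
Function BOMUTF16(bigEndian As Boolean) As String
  // The encoding is left undefined (Nil) so the bytes are written as-is.
  If bigEndian Then
    Return DefineEncoding( ChrB( &hFE ) + ChrB( &hFF ), Nil ) // UTF-16BE: FE FF
  Else
    Return DefineEncoding( ChrB( &hFF ) + ChrB( &hFE ), Nil ) // UTF-16LE: FF FE
  End If
End Function
```

You would write BOMUTF16(True) before text converted to UTF-16BE, and BOMUTF16(False) before text converted to UTF-16LE.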
On the contrary, it’s quite easy, you just have to do it yourself. Write the BOM (I supplied the code above), then write your text. When reading back, strip the BOM, then define the encoding of the rest as UTF-8.
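Roughly, those steps could look like the sketch below. It assumes the classic framework's BinaryStream, and the method names WriteUTF8WithBOM and ReadUTF8StripBOM are made up for illustration. The BOM bytes are built inline so the sketch does not depend on anything else. There is no error handling here, so an IOException from the stream would go to the caller.

```xojo
// Hypothetical helper: writes the UTF-8 BOM, then the text converted to UTF-8.
Sub WriteUTF8WithBOM(f As FolderItem, content As String)
  Dim bom As String = ChrB( &hEF ) + ChrB( &hBB ) + ChrB( &hBF )
  Dim bs As BinaryStream = BinaryStream.Create(f, True) // True = overwrite an existing file
  bs.Write(bom)
  bs.Write(ConvertEncoding(content, Encodings.UTF8))
  bs.Close
End Sub

// Hypothetical helper: reads the file, strips a leading BOM if present,
// then defines the encoding of the rest as UTF-8.
Function ReadUTF8StripBOM(f As FolderItem) As String
  Dim bs As BinaryStream = BinaryStream.Open(f, False) // False = read-only
  Dim raw As String = bs.Read(bs.Length) // raw bytes, no encoding defined yet
  bs.Close

  // Compare the first three bytes as hex to avoid any encoding-aware string comparison.
  If EncodeHex(LeftB(raw, 3)) = "EFBBBF" Then
    raw = MidB(raw, 4) // skip the three BOM bytes
  End If

  Return DefineEncoding(raw, Encodings.UTF8)
End Function
```

Reading the whole file in one call is fine for small text files; for large ones you would want to read in chunks instead.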
Usually, I read from the official source, never elsewhere (elsewhere can be wrong: read what some say about the ASCII table, about the character values from 129 through 256)…
UTF-8 is a variable-width character encoding capable of encoding all 1,112,064[1] valid code points in Unicode using one to four 8-bit bytes.[2] The encoding is defined by the Unicode Standard, and was originally designed by Ken Thompson and Rob Pike.[3][4] The name is derived from Unicode (or Universal Coded Character Set) Transformation Format – 8-bit.[5]
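To see the "one to four bytes" part for yourself, something like the following should do. It is only a sketch: the four code points are arbitrary picks, and it assumes Encodings.UTF8.Chr accepts code points above &hFFFF. The byte sequences in the comments are what UTF-8 defines for those code points.

```xojo
// One sample code point per UTF-8 length:
//   U+0041  -> 41           (1 byte)
//   U+00E9  -> C3 A9        (2 bytes)
//   U+20AC  -> E2 82 AC     (3 bytes)
//   U+1F600 -> F0 9F 98 80  (4 bytes)
Dim codePoints() As Integer = Array(&h41, &hE9, &h20AC, &h1F600)

For Each cp As Integer In codePoints
  Dim s As String = Encodings.UTF8.Chr(cp)
  // Log the code point, its bytes in hex (space separated) and the byte count.
  System.DebugLog("U+" + Hex(cp) + " -> " + EncodeHex(s, True) + " (" + Str(LenB(s)) + " bytes)")
Next
```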