PostgreSQL INSERT Error 'invalid byte sequence for encoding "UTF8"'

I’m using Xojo 2015r2.2/PostgreSQL 9.4 on Windows 8.1 to import a fixed-length-field text file with ca. 300+ records from a Windows application of unknown origin into a PostgreSQL database.

My PostgreSQL database properties are: ENCODING = ‘UTF8’, Postgres client_encoding = UNICODE

The import routine errors on a string containing the text ‘Ciarn’. I changed the database field type from text to varchar(), but the insert still errors.

I’ve trapped the ‘INSERT’ statement from Xojo, containing ‘Ciarn’, and successfully passed it into the Postgres database using pgAdmin III. The same text file imports into SQLite if I set the TextInputStream encoding to ISOLatin1 but this encoding still errors with PostgreSQL.

In Xojo I have …
i) assigned the TextInputStream encoding to Encodings.UTF8 before ReadAll into a string variable - ‘s’ (see EntryFile.Constructor in project ‘EncodingsIssue’ )
ii) tried ConvertEncoding to UTF8 on the string variable - ‘s’ that reads the TextInputStream
iii) tried DefineEncoding to UTF8 on the string variable - ‘s’
iv) tried trapping the string containing ‘Ciarn’ and then using ConvertEncoding/DefineEncoding to UTF8 (see EntryFile.FirstName() ‘EncodingsIssue’)
v) checked the string containing ‘Ciarn’ with Encodings.UTF8.IsValidData() - always ‘False’
vi) Checked the TextEncoding of the string - ‘s’ (see included project - GetEncodings) which is …
Base = 265, Format = 2, Variant = 0, Code = 134217984, InternetName = UTF-8

Here are the projects, with all the things (as above) I’ve tried (PostgreSQL install not included): https://dl.dropboxusercontent.com/u/8841955/EncodingIssue.zip .
Note: after the project ‘EncodingsIssue’ errors with ‘ERROR: invalid byte sequence …’ the UI goes ‘Not Responding’ and should be closed manually.

I know almost nothing about encodings and I’ve run out of ideas. Can anyone suggest something else to try, or tell me what I’m doing wrong with the encodings? Many thanks.

Yeah, that worked because you pasted the string into pgAdmin, and by doing this you changed its encoding to UTF8. Your problem is basically that you have no idea what encoding your original file is in.

And that is a good thing. When you say:

Dim t As TextInputStream = TextInputStream.Open(folEntryFile)
t.Encoding = Encodings.UTF8

what you are doing is claiming to know what the encoding of the file is - UTF8 - and asking Xojo to handle it accordingly. Postgres is kind enough to tell you that you are wrong on this account. So does Xojo, btw, as can be seen in your point v). Only SQLite does not complain.
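To make the point concrete, here is a minimal sketch in Python (not Xojo, just because it is easy to run standalone). It assumes the name in the file has an accented ‘á’ stored as a single byte, as a Latin-1 or Windows-1252 file would store it; that is a guess about the file, not something confirmed in this thread. The helper name is_valid_utf8 is made up for this example.

```python
# Assumption: the accented letter is stored as the single byte 0xE1,
# which is how Latin-1 / Windows-1252 encode "á".
raw = "Ciar\u00e1n".encode("latin-1")  # b'Ciar\xe1n'


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Labelling the bytes as UTF-8 does not change them, so they stay invalid:
# in UTF-8, 0xE1 starts a multi-byte sequence, but the next byte is a plain "n".
print(is_valid_utf8(raw))

# Decoding with the real source encoding, then encoding as UTF-8,
# actually rewrites the bytes (0xE1 becomes 0xC3 0xA1).
fixed = raw.decode("latin-1").encode("utf-8")
print(fixed, is_valid_utf8(fixed))
```

The Xojo analogue would be to set the TextInputStream encoding to the file's real encoding and only then use ConvertEncoding to UTF8, rather than declaring UTF8 up front.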

If I were in your shoes I’d try a number of different encodings on this ‘Ciaran’ string, just as you did with Encodings.UTF8.IsValidData(), until you know what encoding your original file is in. Googling suggests that opening the file in Notepad++ is an established way to tackle those encoding issues.
What you did (checking the encoding of this string ‘s’) is not a valid test, because you are asking Xojo what encoding the string is in only after you told it the encoding is UTF-8. Got it?
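The "try a number of encodings" idea can also be sketched outside Xojo. The Python snippet below is only an illustration: the candidate list and the sample bytes are assumptions (the real bytes would come from reading the file in binary mode), and try_encodings is a made-up helper name.

```python
# Hypothetical candidate list; adjust it to whatever the source application
# might plausibly have written.
CANDIDATES = ["utf-8", "cp1252", "latin-1", "cp850", "mac_roman"]


def try_encodings(data: bytes, candidates=CANDIDATES) -> dict:
    """Decode the same bytes with each candidate; None means decoding failed."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Assumed sample: the problem name with the accented letter as byte 0xE1.
# With a real file, use something like: data = open(path, "rb").read()
sample = b"Ciar\xe1n"

for name, text in try_encodings(sample).items():
    print(f"{name:10} -> {text!r}")
```

Note that single-byte encodings such as Latin-1 accept any byte sequence, so "decodes without error" proves little on its own. The useful signal is which candidate produces text that reads correctly.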

Thank you for the insights and ideas. The Notepad++ idea was an eyeopener. Thanks again. Judith