Encountered invalid character

Hi everybody, and happy new year.
I have a problem importing a text file into my PostgreSQL database using TextInputStream.

I read the lines using ReadLine().

At first I got the error in the title; in debug mode I could also see this character in the string.

But after reading the lines using ReadLine(Encodings.ASCII)

I got the same error, even though I now see a normal character in the string. Can I avoid the problem, or alternatively discard the bad lines? Regards

I’m pretty sure your Postgres is UTF8 by default? So you must use Encodings.UTF8 with ReadLine,
and also use the ConvertEncoding function to convert what comes from Postgres to the UTF8 used in Xojo strings.

Hi Jean, this was my first option; I got the error in the first picture.

The character you call an “invalid character” is shown as the undefined-character glyph. That glyph is what gets displayed when a character does not exist in the requested font.

That said, you may be right.

In the debugger, check its hex value. If it is below 7F, its encoding is ASCII / UTF… In that case, it may be a control character (Ctrl-R?), and you may (nearly) correctly call it ‘invalid’. Check why you have a control character there.

Of course, I may be wrong.

Thanks Emile, could you write a small If-statement sample for this hex value check?

Are the check functions safe from exceptions?
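
A minimal sketch of such a check, assuming `s` is the line returned by ReadLine. The helper name `DumpSuspectBytes` is hypothetical. It works on raw bytes (`LenB`, `MidB`, `AscB`) so the result does not depend on which encoding the string is labeled with, and it logs the hex value of every byte that is a control character or lies outside 7-bit ASCII. As far as I know these byte functions do not raise exceptions on malformed text, but treat that as an assumption rather than a guarantee.

```xojo
Sub DumpSuspectBytes(s As String)
  // Hypothetical helper: log every byte that is not printable ASCII.
  For i As Integer = 1 To LenB(s)
    Dim code As Integer = AscB(MidB(s, i, 1))
    If code < 32 Or code = 127 Then
      // Control character (Tab is &h9, Ctrl-R would be &h12)
      System.DebugLog("Control byte &h" + Hex(code) + " at byte " + Str(i))
    ElseIf code > 127 Then
      // Outside 7-bit ASCII: how to read it depends on the file encoding
      System.DebugLog("Non-ASCII byte &h" + Hex(code) + " at byte " + Str(i))
    End If
  Next
End Sub
```

Calling `DumpSuspectBytes(s)` right after ReadLine shows whether the odd character is a stray control code or an accented letter stored in a non-UTF8 encoding.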

What is s? It should not contain the lozenge. You should DefineEncoding it before feeding it to the database.

s = DefineEncoding(s, Encodings.UTF8 )
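
For context, a short sketch of how DefineEncoding and ConvertEncoding differ. The byte value and the variable names are illustrative assumptions: `raw` stands for data read from a Windows-1252 file. DefineEncoding only changes the label attached to the bytes, while ConvertEncoding rewrites the bytes for the target encoding, so labeling non-UTF8 bytes as UTF8 does not repair them.

```xojo
// Assumption: raw holds one byte, E9, which is "é" in Windows-1252.
Dim raw As String = ChrB(&hE9)

// DefineEncoding changes only the label, not the bytes.
Dim labeled As String = DefineEncoding(raw, Encodings.WindowsANSI)

// ConvertEncoding rewrites the bytes for the target encoding.
Dim utf8 As String = ConvertEncoding(labeled, Encodings.UTF8)

// Compare the byte counts before and after the conversion.
System.DebugLog("Before: " + Str(LenB(labeled)) + " byte(s)")
System.DebugLog("After: " + Str(LenB(utf8)) + " byte(s)")
```

The same "é" needs two bytes in UTF-8, which is why the label has to match the bytes actually in the file before any conversion is attempted.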


s is a String variable (Dim s As String).

	dim dlg as new OpenDialog
	dlg.Filter = csvType
	f = dlg.ShowModal
	if f <> nil and f.Exists then
		dim tis as TextInputStream = TextInputStream.Open(f)
		while not tis.EOF                        ' while not end-of-file
			s = tis.ReadLine(Encodings.UTF8)
			fields = Split(s, ";")               ' put items in fields() array
			dim row as new DatabaseRecord

			' error on this line
			row.Column("name") = ConvertEncoding(fields(1).ToText, Encodings.UTF8)

Well, then it appears the document is somehow not encoded as UTF8.

Where does that document come from?

If from Windows, it could be WindowsANSI.
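
One way to handle that guess in code is to test the raw line before trusting it. This is a sketch under stated assumptions: the helper name `DecodeLine` is hypothetical, the fallback to Windows-1252 is a guess about the file's origin, and `IsValidData` is, to my knowledge, a method of TextEncoding that reports whether the bytes are valid for that encoding.

```xojo
Function DecodeLine(raw As String) As String
  // Hypothetical helper: return the line as a properly labeled UTF-8 string.
  If Encodings.UTF8.IsValidData(raw) Then
    // The bytes are already valid UTF-8; just make sure the label says so.
    Return DefineEncoding(raw, Encodings.UTF8)
  End If

  // Assumption: anything that is not valid UTF-8 came from Windows
  // as Windows-1252 (Encodings.WindowsANSI).
  Dim ansi As String = DefineEncoding(raw, Encodings.WindowsANSI)
  Return ConvertEncoding(ansi, Encodings.UTF8)
End Function
```

Used as `s = DecodeLine(tis.ReadLine)`, this keeps importing instead of failing. A line that matches neither assumption would still come out wrong, so it does not replace knowing the real encoding.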

It’s an Excel file saved as CSV.

You have to know the encoding of the CSV file in order to load it properly. Seems like Excel CSV does not use a clear encoding:

I believe this is CP1252, WindowsANSI. I am going to experiment and report.

OK. Just created a new sheet, entered in a cell, and exported to csv(DOS) on Windows.

The appropriate encoding is Encodings.DOSLatin1

FWIW, I directly open .xlsx files made on a Mac; they are UTF8 and I don’t need any conversion.

We are talking about csv.

I solved it on the PC using Paul’s solution: I opened the CSV in Notepad with UTF8 encoding. Regards to everybody for the support.

The problem is that if you have to use Notepad every time, it is rather inconvenient for regular use. For a one-off, it can work.

Using the proper encoding to read csv documents created with Excel on PC is the proper solution.
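
A minimal sketch of that approach, assuming `f` is the FolderItem chosen in the open dialog and that the file is a plain Excel CSV export from Windows (Windows-1252, i.e. Encodings.WindowsANSI). For a "CSV (MS-DOS)" export, Encodings.DOSLatin1 would be the candidate instead, as found earlier in this thread.

```xojo
// Assumptions: f is a valid FolderItem and the CSV is Windows-1252.
Dim tis As TextInputStream = TextInputStream.Open(f)
tis.Encoding = Encodings.WindowsANSI   // tell the stream how the file is encoded

While Not tis.EOF
  // Read one line in the file's encoding, then convert it to UTF-8
  // so it matches what the database expects.
  Dim s As String = ConvertEncoding(tis.ReadLine, Encodings.UTF8)
  Dim fields() As String = Split(s, ";")
  // ... build the DatabaseRecord from fields() as before
Wend

tis.Close
```

The key point is that the encoding passed to the stream describes the file as it is on disk; the conversion to UTF8 happens afterwards, on a string that is already correctly labeled.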

Agreed. While closing the question, I discovered an option when saving directly to CSV from Excel (Web Options, close to the Save button) that lets you choose the final encoding directly (so UTF-8).