CHR(8217)

I’m having some issues with chr(8217) in a text area … seems text areas don’t like this CHR (I just get a space)

So I tried replacing Chr(8217) with CHR(39) before it got to a text area
if myfile <> nil then
t = textinputstream.open(myfile)
holdstring = t.readall
holdstring = replace(holdstring,chr(8217),chr(39))
inputtext.text = holdstring
end if

but I’m still just getting a blank space.

help :frowning:

You need to define the encoding of the text when you read it in. Otherwise you have no encoding, just single bytes, and chr(8217) will not display correctly.

Thanks Tim … one more quick question …

How do I “define the encoding of the text when [I] read it in”? :slight_smile:

http://documentation.xojo.com/api/text/encoding_text/defineencoding.html ?

Thanks - OK … ummm … stay with me :slight_smile:

The character that troubles me is the apostrophe in the word below
“Mustn’t” … it’s chr(8217)

So - I adjusted the code as advised here to the below …

if myfile <> nil then
t = textinputstream.open(myfile)
holdstring = t.ReadAll.DefineEncoding(Encodings.UTF8)
holdstring = replaceall(holdstring,chr(8217),chr(39))
inputtext.text = holdstring
end if

My textarea just gives me a question-mark-like looking character …
I tried other types of UTF - but nothing :frowning:
I’m loading from a simple text file if that helps any.

Many Thanks

Try it with:


mString = Encodings.UTF8.Chr(8217)

Ah - I see that gets chr(8217) in the text area …
But I’d like the whole text … just either WITH 8217 or swapping it to 39
so where does Encodings.UTF8.Chr(8217) fit into the code?

if myfile <> nil then
t = textinputstream.open(myfile)
holdstring = t.ReadAll.DefineEncoding(Encodings.UTF8)
holdstring = replaceall(holdstring,chr(8217),chr(39))
inputtext.text = holdstring
end if

I know I’m being slow :frowning:

Append it to your string.


mString = EncodedStringWithChar + inputdatastring

You can replace the CHR(8217) with Encodings.UTF8.Chr(8217) anywhere as it returns a string

It doesn’t :frowning:

This is the code …

if myfile <> nil then
t = textinputstream.open(myfile)
'holdstring = t.ReadAll.DefineEncoding(Encodings.UTF8)
holdstring = t.readall
'holdstring = Encodings.UTF8.Chr(8217)
holdstring = replaceall(holdstring,chr(8217),Encodings.UTF8.Chr(8217))
inputtext.text = holdstring
end if

But the word which contains 8217 comes out with a form of a ? in the space.

Don’t replace 8217 with 8217…

What char do you want to see? A quotation?

Try
T.readAll(Encodings.UTF8)
Instead

https://documentation.xojo.com/api/files/textinputstream.html#textinputstream-readall

You can set the encoding in which the string is to be read.
Then append and/or replace the characters. It should work.

Tried with both the below - no luck

if myfile <> nil then
t = textinputstream.open(myfile)
holdstring = T.readAll(Encodings.UTF8)
inputtext.text = holdstring
end if

if myfile <> nil then
t = textinputstream.open(myfile)
holdstring = T.readAll(Encodings.UTF8)
holdstring = replaceall(holdstring,Encodings.UTF8.Chr(8217),chr(39))
inputtext.text = holdstring
end if

Just can’t get Chr(8217) to become chr(39) (it’s an apostrophe by the way)

I suspect your file is not UTF8. With a proper UTF8 file where I copy/pasted your string, this code works.

t = textinputstream.open(myfile)
holdstring = T.readAll(Encodings.UTF8)
inputtext.text = holdstring

What are the actual bytes in the file? UTF8 for chr(8217) is E2 80 99. If your file contains some other byte values, it’s not UTF8.

I don’t see a way here to send you the file itself … but I’ll tell you how I got it.

My students are all living in the UAE (I’m using their data) … so they have Arabic word documents. I copied and pasted from those documents into a txt document (a regular txt). I can see the character there … it’s a Microsoft looking 8217 apostrophe … a “smart” one. I can paste that character to my text area and ask for the ascii, and I get 8217. I do the same with the quotation marks, I get 8220. So, I have no problem pasting the text into the text area. I just can’t directly load it from a txt file … 8217 and 8220 becomes either a “?” or a blank space.
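
Text copied out of Word into a "regular txt" on Windows is often saved as Windows-1252 (ANSI) rather than UTF-8. In that encoding the smart apostrophe is the single byte 92 instead of E2 80 99. A minimal sketch, assuming that is what happened here (it is a guess until the bytes are checked, and the file could just as well be UTF-16). It reuses the thread's names myfile and inputtext:

```xojo
' Sketch only: assumes the file is Windows-1252 (ANSI), which is unconfirmed.
If myfile <> Nil Then
  Dim t As TextInputStream = TextInputStream.Open(myfile)
  ' Tell Xojo how the bytes on disk are encoded
  Dim holdstring As String = t.ReadAll(Encodings.WindowsANSI)
  t.Close
  ' Convert to UTF-8 so the comparison below uses one encoding throughout
  holdstring = ConvertEncoding(holdstring, Encodings.UTF8)
  ' Swap the smart apostrophe (8217) for a plain one (39)
  holdstring = ReplaceAll(holdstring, Encodings.UTF8.Chr(8217), Chr(39))
  inputtext.Text = holdstring
End If
```

If the word still shows a "?" with this, the assumption about the encoding is wrong and the raw bytes need to be looked at.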

By the way - re “What are the actual bytes in the file? UTF8 for chr(8217) is E2 80 99. If your file contains some other byte values, it’s not UTF8.”
How do I get that information?

The ? character, as you call it, is the replacement character, shown when the requested character does not exist in the font being used.

Try Times, Arial, or another font in your TextArea to display the file contents.

Better: open the original Word document to find the name of the font it uses, and set your TextArea to that font.

Tried consolas, arial, and times … see code below … no joy :frowning:

inputtext.TextFont = "Consolas"
if myfile <> nil then
t = textinputstream.open(myfile)
holdstring = T.readAll(Encodings.UTF8)
'holdstring = replaceall(holdstring,Encodings.UTF8.Chr(8217),chr(39))
inputtext.text = holdstring
end if

Somewhat new to Xojo so not sure how to load a Word doc … I’ll be working on that

I do not read that.

There are Xojo Classes to deal directly with Microsoft Word.

You can do that, or simply export the file (from Word) to regular txt, then load that txt into Xojo’s TextArea.

Also: you could create a simple project that only loads your text file, pack it together with that txt file into an archive, and share the whole thing. That would make it far easier to understand what is happening and to give better advice.

Edit:
http://documentation.xojo.com/api/windows/wordapplication.html

WordApplication is reserved to the Windows Platform.

You lost me Emile … but I do appreciate your and everyone’s help.

If I ever work this out … I’ll post what I did.

Thanks all!

[quote=439214:@Philip McCarthy]By the way - re “What are the actual bytes in the file? UTF8 for chr(8217) is E2 80 99. If your file contains some other byte values, it’s not UTF8.”
How do I get that information?[/quote]
Put a break point after you have read the file and then examine the variable in the debugger. Click on the string and there will be a tab to see the Binary values.
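
If the debugger view is awkward, the same information can be dumped in code. A rough sketch, assuming the thread's myfile (a FolderItem) and inputtext (the TextArea). It reads the file as raw bytes and shows them as hex:

```xojo
' Sketch: display the file's raw bytes as hex to identify the encoding.
If myfile <> Nil Then
  Dim bs As BinaryStream = BinaryStream.Open(myfile, False) ' False = read-only
  Dim raw As String = bs.Read(bs.Length) ' raw bytes, no encoding assigned
  bs.Close
  inputtext.Text = EncodeHex(raw, True) ' True puts a space between bytes
End If
```

Look at the bytes where the apostrophe in "Mustn’t" should be: E2 80 99 means UTF-8, a single 92 means Windows-1252, and 19 20 (usually with FF FE at the very start of the file) means UTF-16 little-endian. For a long file the dump will be large, so a short test file containing just that word is easier to read.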

textinputstream is UTF8 … is that what you meant?