ASCII Encoding question.

Hi,
Quick question:
When you define an encoding as ASCII in Xojo, is that standard ASCII, extended ASCII, or is it irrelevant?

Thanks.

You shouldn’t have to anymore. Xojo defaults its text encodings to UTF8, I believe. From what I recall from posts throughout the forum, you should only have to define encodings if Xojo is reading text wrong from a file.

I am simply trying to fully understand all about encodings in Xojo, therefore, I am unsure if it will be standard or extended.

ASCII encoding is 7bit… 0x00 to 0x7F so basically control characters, numbers, upper and lower case alphabet, and basic typewriter characters (!@#$%^&*() etc)

Ok, let me rephrase with a different example:

If I had, for example, a textfield which displayed the ASCII of a depressed key - would it only display the standard values (the first 128, i.e. 0-127), or would it use the extended set and also display the extended equivalent of the depressed key?

did ya try it?

No - I am on my iPad and trying to use my travel time wisely :slight_smile:

When you say “display the ASCII of a depressed key”, are you referring to using ASC(Key) in the KeyDown event?

If so, be aware of 2 things. 1) The Key value passed to the event is UTF8 encoded and may be multi-byte. It is not ASCII encoded. And 2) ASC() returns the code point, not the “ascii” value. For “normal” keys (0-127), they are the same, but for “extended” values, you will get the code point in whatever encoding the Key is in. In this case that is UTF8, so you get the Unicode code point.

So, no, you will not get extended ascii, you will get Unicode code points. (Assuming that I have understood you correctly.)
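
To make the difference between the code point and the bytes concrete, here is a minimal sketch. It assumes a hypothetical key value of "é" (U+00E9) standing in for what KeyDown would pass, and uses System.DebugLog only as a convenient place to print; neither is from the posts above.

```xojo
// Hypothetical stand-in for the Key parameter of KeyDown: "é", UTF-8 encoded.
Dim key As String = &u00E9

// Asc returns the code point of the first character.
System.DebugLog(Str(Asc(key)))   // 233

// The character takes two bytes in UTF-8 (&hC3 &hA9).
System.DebugLog(Str(LenB(key)))  // 2

// AscB returns the value of the first byte, not the code point.
System.DebugLog(Str(AscB(key)))  // 195
```

For keys in the 0-127 range, Asc and AscB give the same number and LenB is 1, which is why the distinction only shows up with “extended” characters.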

Oh dear.
I wanted to be able to display the extended ASCII of whatever key was pressed.

For example:
In the key down event of a textfield I call a method which contains the following code:

// DISPLAY THE ASCII KEY CODE
Dim ConvertedToASCIIEncoding as String = DefineEncoding(inStringToConvert,Encodings.ASCII)
Dim ConvertedASCII_Code as Integer = Asc(ConvertedToASCIIEncoding)
AsciiField.AppendText Str(ConvertedASCII_Code)

  1. Use ConvertEncoding instead of DefineEncoding.
  2. Use an encoding that has an “extended” definition, such as Latin1 or MacRoman. The same key will have a different value (or none at all) depending on the encoding you convert it to.
  3. Which concept of “extended ascii” are you interested in?

Do you mean simply like this:

// DISPLAY THE ASCII KEY CODE
Dim ConvertedToASCIIEncoding as String = ConvertEncoding(inStringToConvert,Encodings.MacRoman)
Dim ConvertedASCII_Code as Integer = Asc(ConvertedToASCIIEncoding)
AsciiField.AppendText Str(ConvertedASCII_Code)
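
A rough sketch of why both points matter (ConvertEncoding rather than DefineEncoding, and the choice of target encoding). The input "é" is a hypothetical example, and AscB is used here instead of Asc so that the result is simply the single byte stored in the converted string.

```xojo
// Hypothetical input: "é" as UTF-8 (two bytes).
Dim key As String = &u00E9

// DefineEncoding only changes the label; the two UTF-8 bytes are untouched.
Dim relabeled As String = DefineEncoding(key, Encodings.ISOLatin1)
System.DebugLog(Str(LenB(relabeled)))   // 2

// ConvertEncoding rewrites the bytes for the target encoding.
Dim asLatin1 As String = ConvertEncoding(key, Encodings.ISOLatin1)
Dim asMacRoman As String = ConvertEncoding(key, Encodings.MacRoman)

// Same character, different byte value in each single-byte encoding.
System.DebugLog(Str(AscB(asLatin1)))    // 233 (&hE9)
System.DebugLog(Str(AscB(asMacRoman)))  // 142 (&h8E)
```

So the number you display depends entirely on which encoding you convert to, and a character the target encoding does not contain has no value there at all.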

[quote=110156:@Richard Summers]Hi,
Quick question:
When you define an encoding as ASCII, in Xojo - is that standard ASCII, extended ASCII, or is irrelevant??

Thanks.[/quote]

  1. there is no such thing as “ASCII extended” - ascii is ascii (0-127) and NOTHING above that.
  2. there are lots of other single byte encodings that use the ascii 0-127 range exactly as is and also define the remaining 128 and up - but they are not “extended ascii” despite what people refer to them as - they are encodings in their own right.

Now I am totally confused :frowning:

:wink:

Read these and then see if that helps
http://www.realsoftwareblog.com/2013/01/encodings-what-are-they.html
http://www.xojo.com/blog/en/2013/08/why-are-there-diamonds-in-my-user-interface.php

I had a reasonable understanding, and then you informed me that extended ASCII does not really exist.

That then confused me, as everywhere you seem to look - it says that standard ASCII uses (0-127), and that extended ASCII uses 128 upwards.

I therefore presumed that the € symbol would be standardised if extended ASCII was used - but now there is NO extended ASCII??
:frowning:

“Extended ASCII” really refers to a code page, which translates the 8-bit binary number to a font glyph. Each code page translates 0-127 the same (they are standardized), but they translate 128-255 differently. You may find the € symbol in some code pages but not in others. There is no standard for “extended ascii”, and that is precisely why unicode was created, so there would be a reliable translation between a given code point and a particular glyph.

Here is an interesting read on ASCII: http://en.wikipedia.org/wiki/ASCII

That is 100% correct. There is no standard for that. Everyone does what is right in his own eyes. The byte values from 128 to 255 are like the wild west. Anything goes. IBM does one thing. Microsoft does something else. Apple does something else entirely.

Thanks Tim - I will forget the idea then, as it is too unreliable.

Thanks for pointing that out :slight_smile:

[quote=110256:@Richard Summers]I had a reasonable understanding, and then you informed me that extended ASCII does not really exist.
[/quote]
No - EXTENDED ascii doesn’t as there are many encodings that have the ascii character set as their initial 128 entries but many different ones for the remaining 128 (chars with code points 128 to 255)

[quote=110256:@Richard Summers]That then confused me, as everywhere you seem to look - it says that standard ASCII uses (0-127), and that extended ASCII uses 128 upwards.
[/quote]
That is just people who have no clue about what they’re talking about :stuck_out_tongue:

[quote=110256:@Richard Summers]I therefore presumed that the € symbol would be standardised if extended ASCII was used - but now there is NO extended ASCII??
:([/quote]

There are many encodings that have the first 128 ascii characters as their first 128
UTF8 is that way and it actually even represents those in the exact same way as ascii - 1 byte.
ISO-Latin1 (8859-1) is a single byte encoding (all characters it can represent consume 1 byte) and it is one that is sometimes referred to as “extended ascii” - it has the same base 0-127 characters and then its own set for 128-255, so its code point values run from 0 to 255

Another known as MacRoman is VERY similar - but not identical - some characters will vary

And here’s a whole bunch more that could be referred to as “extended ascii”
http://en.wikipedia.org/wiki/Category:Mac_OS_character_encodings
http://en.wikipedia.org/wiki/Category:Windows_code_pages
they all have the same characters from 0 to 127 but after that you HAVE to know what the encoding is to get the right character

HOWEVER, all this said, in Xojo you can get the Euro character just by using its unicode code point
http://en.wikipedia.org/wiki/Euro_sign ( EURO SIGN and the code position U+20AC )

dim s as string = &u20ac

That is ALWAYS the Euro sign
NOW what can happen is you can display it in a font that does not have a glyph for that so you get something else.

But s would still always have the euro sign
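
A small sketch to check that for yourself, assuming the string stays in Xojo's default UTF8 encoding (System.DebugLog is just used for printing here):

```xojo
Dim s As String = &u20AC

// One character, whose code point is U+20AC (8364 decimal).
System.DebugLog(Str(Len(s)))   // 1
System.DebugLog(Str(Asc(s)))   // 8364

// In UTF-8 that one character is stored as three bytes.
System.DebugLog(Str(LenB(s)))  // 3
```

The code point stays 8364 no matter which font is used to draw it; only the glyph you see can change.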

Fonts may include or not include glyphs for everything

If you really want your eyes to glaze over you could read the Unicode Consortium’s references on all this :slight_smile: