Character is Emoji?

Hi everyone,

I am looking for a good way to check whether a character within a String/Text is an emoji. If it is, I would like to get its hex code.

For Each c As Text In myText.Characters
  ...?
Next

Or is it better to use

For Each c As UInt32 In myText.Codepoints
  ...?
Next

The best approach is to get the UTF-32 code points and look for values above 65535 (&hFFFF), i.e. outside the Basic Multilingual Plane.
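A minimal sketch of that idea, assuming myText is an existing Text value. It uses the classic framework's Hex function to get the hex code; the variable name hexCode is only illustrative. Note that this test also matches non-emoji characters above &hFFFF and misses emoji that live below it (such as those in the Miscellaneous Symbols block), so it is only a rough first filter.

[code]// Sketch: report every code point outside the Basic Multilingual Plane.
// myText is assumed to be a Text value defined elsewhere.
For Each cp As UInt32 In myText.Codepoints
  If cp > &hFFFF Then
    // Hex is the classic framework function and returns a String.
    Dim hexCode As String = Hex(cp)
    System.DebugLog("Code point outside the BMP: " + hexCode)
  End If
Next[/code]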

This should do what you need:

dim rx as new RegEx
rx.SearchPattern = "\p{So}" // Unicode category "Symbol, other"

dim match as RegExMatch = rx.Search( myText )
while match <> nil
  dim char as string = match.SubExpressionString( 0 )
  // Do something with it

  match = rx.Search // continue searching after the previous match
wend

Keep in mind that RegEx only deals with strings right now.

http://unicode.org/emoji/charts/full-emoji-list.html
Get all the code points and check for those in the emoji range &u1F600 - &u1F1FF.
I'm not sure whether that is one contiguous range.

It’s certainly not since 0x1F600 > 0x1F1FF. :slight_smile:

I did the same thing when I looked at that chart. There is, unfortunately, no great solution. The regex pattern I posted, for example, will match symbols other than just emoji.

Thanks to all of you. I'll try your code, Kem…
Is &u1F600-&u1F1FF the valid range for emoji only?

[quote=263909:@Martin Trippensee]Thanks to all of you. I'll try your code, Kem…
Is &u1F600-&u1F1FF the valid range for emoji only?[/quote]
It doesn’t look like it is a contiguous run - no :frowning:

See http://www.unicode.org/charts/ “Emojis and pictographs”. There are 6 charts.

Ok, to go a bit more into detail:

If you add different emoji to a Microsoft Word document, Word generates the following XML structure:

Sample: Dim s As Text = "My PIGNOSE Test FLAGMAN" (Sorry, the forum's editor does not allow adding emoji, so please imagine them :wink: )

<w:r>
  <w:t>My </w:t>
</w:r>
<w:r>
  <w:rPr>
    <w:rFonts w:ascii="Apple Color Emoji" w:eastAsia="Apple Color Emoji" w:hAnsi="Apple Color Emoji" w:cs="Apple Color Emoji"/>
  </w:rPr>
  <w:t>PIGNOSE</w:t>
</w:r>
<w:r>
  <w:t> Test </w:t>
</w:r>
<w:r>
  <w:rPr>
    <w:rFonts w:ascii="Apple Color Emoji" w:eastAsia="Apple Color Emoji" w:hAnsi="Apple Color Emoji" w:cs="Apple Color Emoji"/>
  </w:rPr>
  <w:t>FLAGMAN</w:t> <!-- FLAG has a length of 2 code points! -->
</w:r>

I am looking for a way to generate this XML structure using the new Xojo framework.

[code]For Each char As Text In s.Characters
  Dim runNode, textNode, textElement As XmlNode

  // Codepoints yields UInt32 values, so the loop variable can be declared inline.
  For Each codePoint As UInt32 In char.Codepoints
    If codePoint >= 128061 And codePoint <= 128373 Then // &h1F43D to &h1F575
      // If codePoint >= &h1F600 And codePoint <= &h1F64F Then ???
      // generate "Apple Color Emoji" XmlNode
    Else
      // generate normal XmlNode without <w:rPr>
    End If
  Next
Next[/code]

I need to know what the valid code point range for emoji is…

Maybe someone can follow me :slight_smile:

Thx

That is precisely what the Unicode charts I pointed to will give you… The range &h1F600 - &h1F64F corresponds to emoticons. Have a look.

Sorry Michel, you are right. Thank you. Look here for the valid Unicode-Ranges:

Select Case codePoint
Case 9728 To 9983, _      // Miscellaneous Symbols
  9984 To 10175, _        // Dingbats
  127744 To 128511, _     // Miscellaneous Symbols and Pictographs
  128512 To 128591, _     // Emoticons
  128640 To 128767, _     // Transport and Map Symbols
  129280 To 129535        // Supplemental Symbols and Pictographs
  // Emoji-XmlNode
Else
  // normal TextNode
End Select

What if an emoji has two code points, like the “BAHAMAS” flag (&u1F1E7 = 127463 and &u1F1F8 = 127480)?

What would be the problem? The system will simply treat them as two different glyphs.

The “problem” is that the two code points are not within the valid “Emoji and pictograph” range. It looks like there will be more emoji! So the code would classify them as “normal TextNode”, which is wrong :wink:

Then just add these cases, right?
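Adding the cases could look roughly like the sketch below. IsEmojiCharacter is a hypothetical helper name, not a framework function. The extra range 127462 To 127487 (&h1F1E6 to &h1F1FF) is the regional indicator block that the two flag code points above belong to. The function checks every code point of a character, so it should work whether Characters hands over a flag as one character or as two halves. The list of blocks is the one from this thread and is not a complete definition of "emoji".

[code]// Hypothetical helper: True if any code point of the character
// falls into one of the blocks discussed in this thread.
Function IsEmojiCharacter(char As Text) As Boolean
  For Each codePoint As UInt32 In char.Codepoints
    Select Case codePoint
    Case 9728 To 9983, 9984 To 10175
      // Miscellaneous Symbols, Dingbats
      Return True
    Case 127462 To 127487
      // Regional indicator symbols (used in pairs for flags)
      Return True
    Case 127744 To 128511, 128512 To 128591, 128640 To 128767, 129280 To 129535
      // Pictographs, Emoticons, Transport and Map, Supplemental Symbols
      Return True
    End Select
  Next
  Return False
End Function[/code]

It could then be used inside the loop over the characters:

[code]For Each char As Text In s.Characters
  If IsEmojiCharacter(char) Then
    // generate "Apple Color Emoji" XmlNode
  Else
    // generate normal XmlNode without <w:rPr>
  End If
Next[/code]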