I have an XML file that is over 10 MB. If I open it with the Oxygen XML Editor, it tells me that the file includes some bidirectional text (Hebrew, Arabic, etc.) but does not identify it, and I need to find it.
Any ideas on how I can do that?
Keep in mind that I do not know which language it is, or whether the characters are composed or decomposed.
Try this RegEx pattern:
That means “not Latin Unicode”.
(No idea if that will do it for you, just throwing out an idea.)
Let me revise that:
That means, “the upcoming character is a letter, but it is not Latin”, and will match the entire string.
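For anyone who wants to try the same “letter, but not Latin” idea outside of a regex, here is a rough Python sketch. It is not the pattern posted above. It approximates “Latin” by checking whether the character's Unicode name contains the word LATIN, which is an assumption and not a true script test, and the function name `find_non_latin_letters` is made up for illustration.

```python
import unicodedata


def find_non_latin_letters(text):
    """Yield (offset, char, name) for each letter that does not look Latin.

    Assumption: a letter counts as Latin when its Unicode character name
    contains "LATIN". This is an approximation, not a real script lookup.
    """
    for offset, ch in enumerate(text):
        # General categories Lu, Ll, Lt, Lm and Lo all start with "L".
        if unicodedata.category(ch).startswith("L"):
            name = unicodedata.name(ch, "")
            if "LATIN" not in name:
                yield offset, ch, name or "<unnamed>"
```

Note that a check like this also reports characters such as the micro sign (U+00B5), which is a letter without LATIN in its name but is not right-to-left.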
The Graphics class has a TextDirection function. You could try passing substrings to that to see what their direction is.
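The same “check the direction of each piece” idea can be sketched in Python with the standard `unicodedata` module, which reports the Unicode bidirectional class of every code point. This is only a sketch under a few assumptions: the file is UTF-8, the set of classes treated as right-to-left (`RTL_CLASSES`) is my own choice, and `find_rtl_chars` is a hypothetical helper name. Because it looks at one code point at a time, it should not matter whether the text is composed or decomposed.

```python
import unicodedata

# Bidirectional classes treated as "right-to-left" here (an assumption):
# R = Hebrew and similar letters, AL = Arabic letters, AN = Arabic numbers,
# RLE / RLO / RLI = explicit right-to-left formatting controls.
RTL_CLASSES = {"R", "AL", "AN", "RLE", "RLO", "RLI"}


def find_rtl_chars(text):
    """Yield (line, column, code point, bidi class) for each RTL character."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            bidi = unicodedata.bidirectional(ch)
            if bidi in RTL_CLASSES:
                yield line_no, col, f"U+{ord(ch):04X}", bidi
```

To run it on a file, read the contents with something like `open(path, encoding="utf-8").read()` (where `path` is your own file) and print each tuple that `find_rtl_chars` yields. The line and column numbers are 1-based and count code points, not bytes.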
Thanks for the suggestions.
I did find some instances of the “micro sign” character (UTF-8: C2 B5), but nothing else.
Keving: I will give that a try.