We have discovered ways of manipulating the encoding of source code files so that human viewers and compilers see different logic. One particularly pernicious method uses Unicode directionality override characters to display code as an anagram of its true logic. We’ve verified that this attack works against C, C++, C#, JavaScript, Java, Rust, Go, and Python, and suspect that it will work against most other modern languages.
Just looking through the paper and trying something simple: Xojo doesn’t render the Unicode correctly, so instead of seeing abcdef (yes, there is Unicode in this post: %u2067%u2066abc%u2069%u2066def%u2069%u2069) you see abcdef, which means you couldn’t hide the change while using the Xojo IDE. That said, if the code went downstream for checking, for example to a source control system that rendered it correctly, it might be overlooked and nefarious code could be let through.
I just tried the example from @GarryPettet’s post about the String.Length issue. It gives 21, as Garry said. The “Clean Invisibles” command does detect those characters, which are multi-byte UTF-8 characters.
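For anyone wondering why a length comes out larger than what you can see, here is a rough Python sketch (not Xojo, and not Garry's original string). It rebuilds the abc/def example from earlier in the thread with explicit escapes; the `ISOLATES` set and the variable names are just for illustration.

```python
# The abc/def example from earlier in the thread, written with explicit
# escapes so the invisible characters survive copy and paste.
sample = "\u2067\u2066abc\u2069\u2066def\u2069\u2069"

# Assumption for this sketch: "invisible" means only the four
# directional isolate controls (U+2066 to U+2069).
ISOLATES = {"\u2066", "\u2067", "\u2068", "\u2069"}

visible = "".join(ch for ch in sample if ch not in ISOLATES)
hidden = [ch for ch in sample if ch in ISOLATES]

print(len(visible))                  # 6 letters you can see
print(len(sample))                   # 12 code points in total
print(len(sample.encode("utf-8")))   # 24 bytes once encoded as UTF-8
print({len(ch.encode("utf-8")) for ch in hidden})  # {3}: each control is 3 bytes
```

So a six-letter string can report twice the length you expect, and four times the byte count, without anything looking wrong on screen.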
You can move stuff around so it visually renders in one way but the actual code sent to the compiler reads another way, so you can fake a change and have it push into live code.
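One way to catch this before it reaches review is to scan for the control characters directly instead of trusting what the editor shows. Below is a minimal Python sketch of that idea. The function name `find_bidi_controls`, the choice of characters in `BIDI_CONTROLS`, and the `suspicious` sample line are my own assumptions for illustration; they are not taken from the paper or from any Xojo tooling.

```python
import unicodedata

# Bidirectional control characters: embeddings, overrides, isolates,
# and the directional marks. Adjust the set to taste.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
    "\u200e", "\u200f", "\u061c",                      # LRM, RLM, ALM
}

def find_bidi_controls(source):
    """Return (line, column, code point, name) for each bidi control found."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col, f"U+{ord(ch):04X}", name))
    return hits

# Made-up line of source with controls hidden inside a string literal.
suspicious = 'access = "user\u202e \u2066// check admin\u2069 \u2066"\n'

for hit in find_bidi_controls(suspicious):
    print(hit)
```

Running something like this as a pre-commit or CI check means the reviewer does not have to rely on the diff viewer rendering the text honestly.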
It’s really bad when bad actors have trusted access to source in a corporation, as those checking the changes over might not even know they are letting through a new zero-day exploit.
I remember many years ago finding a Trojan on the Windows install CD. They were pressed in Europe, so even if you produce a good product you have to secure end-to-end delivery. Don’t expect hashes to help; they won’t.
I found another “neat” Xojo issue with non-visible characters yesterday. It turns out that you can give a class property a name that contains non-visible characters (I did this by mistake, pasting the property name into the IDE). You then can’t access that property by typing its visible name.
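The same kind of trap can be reproduced outside Xojo. This is a small Python analogy, not a description of how the Xojo IDE or compiler behaves; the `Settings` class and the `timeout` name are hypothetical, and the zero-width space (U+200B) stands in for whatever non-visible character got pasted.

```python
class Settings:
    pass

cfg = Settings()

# Hypothetical pasted name: "timeout" followed by a zero-width space.
pasted_name = "timeout\u200b"
setattr(cfg, pasted_name, 30)

# The name you would type by hand does not match the stored one.
print(hasattr(cfg, "timeout"))     # False
print(hasattr(cfg, pasted_name))   # True

# ascii() escapes the hidden character so it shows up in the output.
print([ascii(name) for name in vars(cfg)])
```

Printing names through something like `ascii()` is a quick way to see whether an identifier that "looks right" actually contains extra characters.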