Of course, I supposed that
Could my existing project handle RegEx differently from a new project?
Thanks for the effort, the result is what I need, but I have to make it work with a generic existing RegEx option in my app…
It was not my effort, but ChatGPT's…
Try it once more with “my code is faulty” in mind; if that is the case, you will now find the error…
You may start by replacing the ComboBox with a string containing a u-umlaut (ü)…
Then modify each and every line …
This one too I guess
This is super crazy: if I type ü in the text to be processed in the code, the code works. If I copy a file name with an ü and paste it into the code, it fails. Maybe time for lunch.
Test confirms: a typed ü is different from a ü copied from the Finder. This blows my mind.
In a normal s.replace function, this does not matter. But in RegEx it does.
There is more than one “ü”, depending on how the character is composed. In UTF-8 alone there are two:
ü as one code point, and ü as a u code point + a combining ¨ code point.
They look the same, but internally they are different byte sequences.
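To make the two forms concrete, here is a small sketch. It is written in Python only because its standard library makes the code points easy to inspect; it is not Xojo code, and the variable names are made up for illustration.

```python
import re

# Precomposed form: a single code point, U+00FC.
u_precomposed = "\u00fc"
# Decomposed form: plain "u" followed by U+0308 (combining diaeresis).
u_decomposed = "u\u0308"

# Both render as the same glyph but are not equal as strings.
print(u_precomposed == u_decomposed)          # False
print(len(u_precomposed), len(u_decomposed))  # 1 2 (code points)
print(len(u_precomposed.encode("utf-8")),
      len(u_decomposed.encode("utf-8")))      # 2 3 (UTF-8 bytes)

# A pattern written with the precomposed form does not match decomposed text.
pattern = re.compile(u_precomposed)
print(bool(pattern.search("M" + u_precomposed + "ller")))  # True
print(bool(pattern.search("M" + u_decomposed + "ller")))   # False
```

This would explain why a pattern containing a typed ü can miss a file name pasted from elsewhere, even though both look identical on screen.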
Not crazy: there is a difference between the two…
A ü typed as an item name in the Finder (*) is different from (<>) a ü typed in Xojo, which is UTF-8 (AFAIK).
I do not know the “Finder Encoding”…
- Read with FolderItem1.Display…
Go figure!
I’m pretty sure you can’t replace multiple characters with others in ONE regex
except if @Kem_Tekinay proves the contrary (and he is capable of…)
You need one regex for each kind of character you want to replace,
e.g. [ÀÁÂÃÄÅ] → A, [ÈÉÊË] → E, etc…
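The one-pattern-per-group approach described above could look roughly like this. The sketch is in Python rather than Xojo, the table and the function name `strip_accents_by_class` are hypothetical, and it only handles precomposed characters.

```python
import re

# Hypothetical table: one character class per replacement letter.
REPLACEMENTS = [
    ("[\u00c0-\u00c5]", "A"),  # À Á Â Ã Ä Å
    ("[\u00c8-\u00cb]", "E"),  # È É Ê Ë
]

def strip_accents_by_class(text):
    # Apply each substitution in turn: one regex per target letter.
    for pattern, letter in REPLACEMENTS:
        text = re.sub(pattern, letter, text)
    return text

print(strip_accents_by_class("\u00c0 la \u00c9cole"))  # A la Ecole
```

Each pass is a separate substitution, so the number of regex runs grows with the number of target letters.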
What I found in Xojo: the encoding of a text (• test ü ä 123) is UTF-8 both when typed and when copied from the Finder. But the length/length in bytes differs: 14/18 vs 16/20. Does this information lead somehow to a solution?
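Those two pairs of numbers are consistent with the same text being stored in two Unicode normalization forms. A short Python sketch (not Xojo; `unicodedata` is Python's standard library module) reproduces both pairs, assuming the sample uses single spaces and the bullet U+2022:

```python
import unicodedata

sample = "\u2022 test \u00fc \u00e4 123"  # "• test ü ä 123", precomposed

nfc = unicodedata.normalize("NFC", sample)  # ü and ä as single code points
nfd = unicodedata.normalize("NFD", sample)  # ü and ä as letter + U+0308

print(len(nfc), len(nfc.encode("utf-8")))  # 14 18
print(len(nfd), len(nfd.encode("utf-8")))  # 16 20
```

The difference of two characters and two bytes matches the two umlauts each gaining one combining code point, so the copied text is probably in the decomposed form.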
The web shows these kinds of examples:
var article = a.replace(/ |\./g, "_").replace(/\r?\n|\r|\$|\#|\[|\]/g, "")
But I don’t know if Xojo RegEx can handle this; it appears not to be standard.
As already said, they look the same but they aren’t:
Const addsDiaeresis = &u0308
Var uUmlautPreComposed As String = "ü"
Var uUmlautComposite As String = "u"+addsDiaeresis
Can I convert one into the other so I can use them as expected?
Replace both.
User won’t
Your code can.
I am not going to code every single character in every format; the text must be unified.
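Unifying the text first is usually done by normalizing it to one Unicode form before the regex runs. A minimal Python sketch of the idea follows (not Xojo; how Xojo exposes normalization is not covered in this thread, and `replace_umlaut` is a hypothetical helper):

```python
import re
import unicodedata

def replace_umlaut(text):
    # Bring every input to the precomposed form (NFC) first.
    unified = unicodedata.normalize("NFC", text)
    # The pattern now only needs the precomposed ü (U+00FC).
    return re.sub("\u00fc", "ue", unified)

typed = "M\u00fcller"         # precomposed
from_finder = "Mu\u0308ller"  # decomposed
print(replace_umlaut(typed), replace_umlaut(from_finder))  # Mueller Mueller
```

With this order of operations the pattern is written once, in one form, and the origin of the text no longer matters.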
Ok
I appreciate your effort!