Dim dlg As New OpenDialog
Dim f As FolderItem
dlg.Title = "Bitte Anhang auswählen..."
f = dlg.ShowModal
If f <> Nil Then
  Dim before As String
  Dim after As String
  System.DebugLog("f.Name: " + f.Name)
  'before = f.Name
  before = "geschäftsverlauf.pdf"
  System.DebugLog("before: " + before)
  after = ReplaceAll(before, "ä", "ae")
  after = ReplaceAll(after, "ö", "oe")
  after = ReplaceAll(after, "ü", "ue")
  System.DebugLog("after: " + after)
End If
That works; the last System.DebugLog shows this:
after: geschaeftsverlauf.pdf
But when I use an already filled string like f.Name, which comes from a file-open dialog, the conversion does not work.
I’d guess that this is a problem of precomposed vs. decomposed UTF-8, and that it is a feature and not a bug. You have to make sure that your strings are normalized. I have had fun like this before, too, and thought I was going mad because the strings wouldn’t match in a comparison.
I understand that you want to sell your plugins. That’s OK, I do this with my products as well. But I hope you understand that I want to do this without plugins. Beatrix mentioned that this is not a bug. But when I look at these two variables, f.Name and attachment.Name, they are both strings. Why is Xojo not able to compare two strings? What about the Text type? Can this help to get the strings “normalized”?
The declares into “Carbon.framework” are misleading. Those functions actually live in “CoreFoundation.framework” and the declare works only because Carbon re-exports the symbols from CoreFoundation. The “Normalize” function in the post will keep working in the future, even in 64-bit applications.
I would ignore the suggestion to use ReplaceAllB. Operating on strings at the byte level is generally inadvisable unless you really know what you’re doing or your string consists of binary data (as opposed to something textual).
[quote=169288:@Michel Bujardet]Michael, I don’t know if you even READ posts made for you. This post explains a big deal, as well as provides a cleaned up file name for you.
At some point, it is extremely frustrating to try to help just to see someone apparently intent on overlooking one’s efforts :/[/quote]
Sorry Michel, I didn’t want to ignore you. Your suggestion works: it replaces ä with a, which is the best I have at the moment. But what I’m looking for is to replace ä with ae. In German it is not enough to replace ä with a; the result is a totally different name. And to answer the question: I do read all posts made for me.
You know, what is most frustrating is that I tried painfully to help, attempted to explain what a composite character was, and you did not even make the elementary effort to ask anything.
If you paid attention to what I have been painfully trying to explain, there are two ways to represent accented characters in a UTF-8 string: as one character that carries the accent, or as two characters: the letter itself, then the accent.
Here is the solution :
replaceall(name,"a"+chr(776),"ae")
Now, I am done. If you want to do the same for other characters, analyse the string character by character. That’s all I will share since apparently it does not get through.
As Beatrix mentioned, it’s almost certainly due to precomposed versus decomposed characters. This is a quirk of Unicode where some characters can be represented multiple ways. For example, ä can be represented in two different ways:
U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
U+0061 LATIN SMALL LETTER A, U+0308 COMBINING DIAERESIS
Simplifying a small bit, string comparison is done by looking at Unicode codepoints. In this case, ReplaceAll is looking for the specific sequence U+00E4 but your string has U+0061, U+0308.
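To make the mismatch concrete outside Xojo, here is a minimal Python sketch (Python is used only because it is easy to run; it is not Xojo code). It assumes the file name arrives in decomposed form, which is what the behaviour described in this thread suggests; the sample name is made up for illustration.

```python
# Two spellings of the same visible character "ä".
precomposed = "\u00e4"   # U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
decomposed = "a\u0308"   # U+0061 followed by U+0308 COMBINING DIAERESIS


def codepoints(s):
    """Return the code points of s in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in s]


# Hypothetical file name in decomposed form (assumption: this is
# what the file dialog hands back on the asker's system).
name = "gescha\u0308ftsverlauf.pdf"

# Searching for the precomposed form finds nothing ...
unchanged = name.replace(precomposed, "ae")

# ... while searching for the decomposed sequence succeeds.
fixed = name.replace(decomposed, "ae")

print(codepoints(precomposed), codepoints(decomposed))
print(unchanged == name, fixed)
```

A code-point-based search only matches when the search term uses the same representation as the data, which is why the hard-coded literal worked and f.Name did not.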
The Text type operates at a higher level than String and treats both representations the same. For example:
Dim precomposedValue As Text = &u00E4
Dim decomposedValue As Text = &u0061 + &u0308
Dim result As Text = precomposedValue.ReplaceAll(decomposedValue, "foo")
// result is now simply "foo"
Normalization refers to the act of taking a series of Unicode codepoints and putting all of them into one specific form. For example, there’s a form of normalization that converts everything to decomposed characters. There’s no built-in way in Xojo to perform normalization, but the Text type renders it mostly unnecessary.
If you don’t want to use Text, the forum post with declares that was linked to earlier is a good approach. Once you’ve normalized your string, you only have to worry about doing a ReplaceAll on one specific form of your character.
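The normalize-then-replace idea can be sketched as follows. This is a Python illustration, not Xojo; it uses Python's standard unicodedata module as a stand-in for whatever normalization routine is available on your platform. The helper name transliterate and the mapping table are hypothetical and cover only the three lowercase umlauts discussed in this thread.

```python
import unicodedata

# Precomposed umlauts and their German transliterations
# (illustrative table; extend it for uppercase letters or ß as needed).
UMLAUTS = {"\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue"}


def transliterate(name):
    """Normalize to composed form (NFC), then replace each umlaut."""
    composed = unicodedata.normalize("NFC", name)
    for umlaut, replacement in UMLAUTS.items():
        composed = composed.replace(umlaut, replacement)
    return composed


# Works no matter which representation the input uses.
print(transliterate("gescha\u0308ftsverlauf.pdf"))  # decomposed input
print(transliterate("gesch\u00e4ftsverlauf.pdf"))   # precomposed input
```

Normalizing first means the replacement table only needs one representation of each character.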
Thanks very much for the explanation, Joe. The solution Michel provided above works the way I need it. But I want to understand where I can find the character number, like the chr(776) Michel posted for the second part of the ä. Where do I have to look to find out what an ö is made of (o and some chr(?))?
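One way to find out what a character is made of is to decompose it and list each code point with its number and official Unicode name. The sketch below does this in Python (again only as an illustration, not Xojo); the helper name describe is hypothetical.

```python
import unicodedata


def describe(s):
    """List (U+XXXX, decimal number, Unicode name) for each code point in s."""
    return [
        (f"U+{ord(ch):04X}", ord(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in s
    ]


# Decompose the precomposed ö (U+00F6) into base letter + combining mark.
parts = unicodedata.normalize("NFD", "\u00f6")

for code, number, name in describe(parts):
    print(code, number, name)
```

The combining diaeresis is the same code point, U+0308 (decimal 776), for ä, ö and ü; only the base letter differs, so the second part of an ö is also chr(776). The Unicode code charts list these numbers as well.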