mail.app attachment as symbol or within mail

I have already placed some System.DebugLog statements in there. But guess what…

Dim name as String

name = replaceAll(f.Name, "ä", "ae")

system.debuglog(name)

gives me exactly the original name, still with the umlaut.

When I use a fixed string in the replaceAll statement, it converts:

Dim name as String

name = replaceAll("some-nice-filename-with-ä", "ä", "ae")

system.debuglog(name)

gives me some-nice-filename-with-ae, which is what I need.

ReplaceAll knows nothing of the various possible ways a codepoint can be encoded into bytes, especially with something like an umlaut. It could be a combining pair or a single character. The name from the file system is encoded one way; the name from your string literal is encoded another way. That’s one of the advantages of the new Text class. It works with the original codepoints, not some arbitrary byte representation.
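To make the composed versus decomposed point concrete, here is a minimal sketch in Python rather than Xojo, chosen only because its standard library exposes Unicode normalization directly. It assumes the file system hands back the decomposed form ("a" followed by a combining diaeresis), which matches the bytes reported later in this thread. The helper name `transliterate` is made up for illustration.

```python
import unicodedata

# Precomposed "ä" (one code point, NFC), as typically typed in source code.
literal = "some-nice-filename-with-\u00e4"
# "a" + combining diaeresis (two code points, NFD), as assumed from the file system.
from_disk = "some-nice-filename-with-a\u0308"

def transliterate(name):
    # Normalize to NFC first so "ä" becomes a single code point, then replace it.
    return unicodedata.normalize("NFC", name).replace("\u00e4", "ae")

# A plain replace finds nothing in the decomposed name and leaves it unchanged.
print(from_disk.replace("\u00e4", "ae") == from_disk)
# After normalization both variants give the same result.
print(transliterate(from_disk))
print(transliterate(literal))
```

The same idea should carry over to any environment that can normalize the file name before searching in it; whether Xojo offers that out of the box is not something this thread settles.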

So, MBS Plugins 15.0pr12 will encode attachment names.
Testing is welcome.

I hate it when something does not work as expected. It turns out that, for some reason, Xojo displays characters that are composites of 3 bytes for each accented character. As a result, Instr() does not work with accented characters from the name of a file such as “a?e?e?c?a?u?a?e?i?o?u?.txt”. Looking at each byte in the string, an accented character is composed of “a”, then &uCC, then &u88.

This will remove all accents :

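[code] Dim f As FolderItem = GetOpenFolderItem("special/any")
If f <> Nil Then
  Dim rawname As String = f.Name
  Dim name As String
  // Keep only ASCII characters. In a decomposed name this drops the
  // combining accents and leaves the base letters.
  For i As Integer = 1 To Len(rawname)
    If Asc(Mid(rawname, i, 1)) < 128 Then
      name = name + Mid(rawname, i, 1)
    End If
  Next
  MsgBox name
End If[/code]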

I tried ReplaceAllB with “a”+&uCC+&u88, it simply does not work. Looks like a bug, but at this point, I am not ready to continue with this charade.

&uCC is probably not the same as CHR(&hCC). You want to use byte values, not codepoints.
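A small Python sketch of why the two differ, assuming that &uCC and &u88 denote the code points U+00CC and U+0088 rather than raw bytes. Encoded as UTF-8, those code points become two bytes each, so the resulting byte sequence no longer matches what is in the file name.

```python
# "a" followed by the code points U+00CC and U+0088 (the assumed meaning of &uCC and &u88).
as_codepoints = "a\u00cc\u0088".encode("utf-8")

# "a" followed by the combining diaeresis U+0308, which is what the file name contains.
as_combining = "a\u0308".encode("utf-8")

print(as_codepoints.hex(" "))  # 61 c3 8c c2 88
print(as_combining.hex(" "))   # 61 cc 88
```

So a byte-wise search needs the raw bytes &hCC and &h88, or equivalently the single code point U+0308 after the "a".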

I quickly looked at the content of the string, it is indeed something else.

What that says is that UTF-8 is not quite as convenient as I thought. It does not normalize code points. Neither does the Text type, which I tried as well.

I am not going to lose sleep over that, though. Filenames with no accents are just fine to me.

Any reason you are not trying Beatrix’s suggestion?

http://documentation.xojo.com/index.php/EncodeURLComponent

attachment.name = EncodeURLComponent(f.Name)

[quote=169184:@Michel Bujardet]I quickly looked at the content of the string, it is indeed something else.

What that says is that UTF-8 is not quite as convenient as I thought. It does not normalize code points. Neither does the Text type, which I tried as well.

I am not going to lose sleep over that, though. Filenames with no accents are just fine to me.[/quote]
I think you’re missing my point. Your use of &uXX is invalid in that context. The Text type should handle the filename, I would think, unless it doesn’t merge composite characters into a single codepoint. But it should.

[quote=169196:@Bob Coleman]Any reason you are not trying Beatrix’s suggestion?

http://documentation.xojo.com/index.php/EncodeURLComponent

attachment.name = EncodeURLComponent(f.Name)[/quote]

This is likely the best solution, anyway.

That is the problem. It does not. I think it should. It needs a well prepared bug report.

I will look into EncodeURLComponent tomorrow. I am not too sure it will handle the multibyte accented characters well, though.

I tried EncodeURLComponent with my attachment.Name:

[code] Dim dlg As New OpenDialog
Dim f As FolderItem

Dim result As  String

dlg.Title = "Bitte Anhang auswählen..."
f = dlg.ShowModal

if f <> NIL then
  System.DebugLog("f.Name: " + f.Name)
  
  attachment.LoadFromFile(GetFolderItem(f.NativePath, FolderItem.PathTypeNative))
  
  attachment.Name = EncodeURLComponent(f.Name)
  System.DebugLog("f.Name: " + f.Name)       

  mail.Attachments.Append(attachment)
end if[/code]

The system.debuglog before the EncodeURLComponent gives me this:

f.Name: geschäftsverlauf-06-09-2014.pdf

and the system.debuglog after EncodeURLComponent gives me this:

attachment.Name: gescha%CC%88ftsverlauf-06-09-2014.pdf

What now? How do I make a usable attachment.Name out of this encoded string?

Michael

Btw: Regarding the replaceAll problem, should I open a new topic with a simple code example showing that there seems to be a bug when a variable is used as the source string in this statement?

[quote=169263:@Michael Bzdega]I tried EncodeURLComponent with my attachment.Name:

[code] Dim dlg As New OpenDialog
Dim f As FolderItem

Dim result As  String

dlg.Title = "Bitte Anhang auswählen..."
f = dlg.ShowModal

if f <> NIL then
  System.DebugLog("f.Name: " + f.Name)
  
  attachment.LoadFromFile(GetFolderItem(f.NativePath, FolderItem.PathTypeNative))
  
  attachment.Name = EncodeURLComponent(f.Name)
  System.DebugLog("f.Name: " + f.Name)       
  
  mail.Attachments.Append(attachment)
end if[/code][/quote]

Sorry, there is an error in the code I posted above, but I cannot edit the post.

The second System.DebugLog statement should be:

System.DebugLog("attachment.Name: " + attachment.Name)

EncodeURLComponent is only half the battle. Have a look at the specification. You need to add something like "*=UTF-8''" in front of the attachment name.
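For reference, a sketch of what the extended parameter syntax looks like as header text, written in Python for illustration. One assumption worth stating loudly: in this syntax the asterisk belongs to the parameter name (`filename*`), followed by the charset, two apostrophes enclosing an empty language tag, and the percent-encoded value without quotes. Whether Xojo's EmailAttachment.Name lets you influence the parameter name at all is not clear from this thread. The function name `ext_filename_param` is hypothetical.

```python
import unicodedata
from urllib.parse import quote

def ext_filename_param(name, charset="UTF-8"):
    # Percent-encode the bytes of the name; safe="" also encodes "/".
    encoded = quote(name.encode(charset), safe="")
    # Parameter name with asterisk, charset, empty language tag, encoded value.
    return "filename*=%s''%s" % (charset, encoded)

# Decomposed form, as assumed from the file system.
name = unicodedata.normalize("NFD", "gesch\u00e4ftsverlauf-06-09-2014.pdf")
print(ext_filename_param(name))
# filename*=UTF-8''gescha%CC%88ftsverlauf-06-09-2014.pdf
```

The printed text is a parameter of the Content-Disposition header, not a value for the plain `filename="..."` parameter.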

Sorry Beatrix, I tried to understand the RFC you mentioned and put this in my code:

      attachment.Name = EncodeURLComponent(f.Name)
      attachment.Name = "*=UTF-8" + attachment.Name

The result is attachment.Name: *=UTF-8gescha%CC%88ftsverlauf-06-09-2014.pdf, and in Mail.app the attachment name is also shown as *=UTF-8gescha%CC%88ftsverlauf-06-09-2014.pdf.

So, what is missing?

Yeah, the RFC is pretty suboptimal. As far as I can see you are missing the quotes around the name. Like in *=UTF-8"gescha%CC%88ftsverlauf-06-09-2014.pdf".

Added the quotes, but this also does not work:

attachment.Name = "*=UTF-8" + chr(34) + attachment.Name + chr(34)

I’ll have a look either in the afternoon or tomorrow.

When in doubt, do reverse engineering. I sent myself a mail with the classical umlaut test “smörebröd”. Here is what this looks like in Mail.

--Apple-Mail=_105164BA-5719-4EE2-9FF4-E1877F128524
Content-Transfer-Encoding: base64
Content-Disposition: inline;
    filename*=utf-8''smo%CC%88rebro%CC%88d.png
Content-Type: image/png;
    x-unix-mode=0644;
    name="=?utf-8?Q?smo=CC=88rebro=CC=88d=2Epng?="
Content-Id: F2C3134C-317A-40C5-8723-3B4764949ED2@fritz.box

Then I added an attachment in Xojo with

[code] Dim tempFile As FolderItem = SpecialFolder.Desktop.Child("smörebröd.zip")
Dim theAttachment As New EmailAttachment
theAttachment.LoadFromFile(tempFile)
theAttachment.Name = "=?utf-8?Q?smo=CC=88rebro=CC=88d=2Ezip?="
MailToCompany.Attachments.Append theAttachment[/code]

And it works. The message now has both Content-Disposition and Content-Type, each containing smörebröd in a different encoding. The encoding used in Content-Type is the one usually used for subjects.
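The hard-coded name above can also be generated. A minimal sketch in Python of building such a Q-encoded word, assuming the name is in the decomposed form seen in this thread. The function name `q_encoded_word` is made up, and the sketch ignores the length limit that applies to encoded words, so long names would need splitting.

```python
import unicodedata

def q_encoded_word(text, charset="utf-8"):
    out = []
    for byte in text.encode(charset):
        ch = chr(byte)
        if ch.isascii() and ch.isalnum():
            # ASCII letters and digits stay as they are.
            out.append(ch)
        else:
            # Everything else, including "." and space, becomes =XX.
            out.append("=%02X" % byte)
    return "=?%s?Q?%s?=" % (charset, "".join(out))

# Decomposed form of the test name.
name = unicodedata.normalize("NFD", "sm\u00f6rebr\u00f6d.zip")
print(q_encoded_word(name))
# =?utf-8?Q?smo=CC=88rebro=CC=88d=2Ezip?=
```

Encoding every non-alphanumeric byte is more conservative than strictly necessary, but it reproduces the form Mail generated in the headers above.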