File URLs with Unicode using FolderItem.URLPath

I have a Xojo FolderItem, and I want to pass its URL (which uses the file:// scheme) to an HTMLViewer. It works fine when the path name is all ASCII, but when Unicode is involved, it fails. It turns out that Internet Explorer is “special” in how it handles file URLs:

See “Unicode characters in a URL (all ok - except for IE)” on Stack Overflow.

So the question is, how do I handle this in Xojo?

In Xojo, FolderItem.URLPath returns a path that is fully percent-encoded, e.g.

file:///C:/Users/user/Desktop/folder%20with%20%C3%BCmlaut/bbb_%C3%BCmlaut_filename_1080p.m4v

but what I need is

file:///C:/Users/user/Desktop/folder%20with%20ümlaut/bbb_ümlaut_filename_1080p.m4v

If I try

DecodeURLComponent(url, Encodings.UTF8) 

I don’t get the right answer, as spaces (and probably other characters) are no longer percent-encoded, and I get something like this which does not work:

file:///C:/Users/user/Desktop/folder with ümlaut/bbb_ümlaut_filename_1080p.m4v

(notice the spaces in the folder name are no longer %20).

In other words, it looks like I need to convert some of the %-encoded entities but leave others.

This opens perfectly in Internet Explorer:

[code]
dim f as FolderItem = GetOpenFolderItem("special/any")
if f <> nil then
  dim explore as FolderItem = GetFolderItem("C:\Program Files\Internet Explorer\iexplore.exe")
  explore.Launch(f.ShellPath)
end if
[/code]

Note: the path of the file I point to with f is C:\Users\mitch\Desktop\folder with ümlaut\robot.jpg

Hi Michel,

Opening a FolderItem from Xojo in IE is not the issue. The problem is a file:// URL inside HTML that is parsed by IE (or loaded in an HTMLViewer).

In other words:

<!DOCTYPE html>
<html>
<video src="file:///C:/Users/user/Desktop/folder with ümlaut/bbb_ümlaut_filename_1080p.m4v">
</video>
</html>

does not work, while

<!DOCTYPE html>
<html>
<video src="file:///C:/Users/user/Desktop/folder%20with%20ümlaut/bbb_ümlaut_filename_1080p.m4v">
</video>
</html>

does work.

I’ve come up with a hack which seems to work - of course I would be careful with this:

Function URLPathForInternetExplorer(extends f as FolderItem) As string
  #Pragma DisableBackgroundTasks
  
  // see https://forum.xojo.com/37623-file-urls-with-unicode-using-folderitem-urlpath
  if f = nil then
    return ""
  end if
  
  dim u1 as string = f.URLPath
  
  // algorithm: any %XX escape whose value is a high byte (i.e. &h80 to &hFF) is turned back into a raw byte;
  // consecutive raw bytes then form the original UTF-8 sequence. Every other escape (e.g. %20) stays as it is.
  
  
  dim mb as MemoryBlock = u1
  dim i as integer = 0
  dim u as integer = mb.Size-1
  
  dim result as string
  
  while i <= u
    dim c as string = mb.stringValue(i,1)
    c = c.DefineEncoding(Encodings.UTF8)
    
    if c = "%" and i + 2 <= u then
      ' this is the start of a %XX encoded sequence
      ' look ahead 2 letters
      
      dim HH as string = mb.StringValue(i+1,2)
      dim h as integer = val("&h" + HH)
      
      if h >= &h80 then
        ' this is a percent-encoded Unicode-sequence, turn it back into UTF8
        result=result+ChrB(h)
        
      else
        ' this is a percent-encoded entity that should stay percent encoded
        result=result + "%" + HH
        
      end if
      
      ' jump ahead 3
      i = i + 3
    else
      
      // normal character, just add it to result
      result = result + c
      i = i + 1
    end if
    
    
  wend
  
  result = result.DefineEncoding(Encodings.UTF8)
  
  return result
End Function
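
For completeness, here is a minimal sketch of how the extension method above might be used to build the HTML from the earlier post and load it. The control name HTMLViewer1 is an assumption (any HTMLViewer on the window would do), and passing f.Parent as the relative-to folder is just one reasonable choice, not something tested against every IE version.

```xojo
// Hypothetical usage sketch: HTMLViewer1 is an assumed control name.
dim f as FolderItem = GetOpenFolderItem("special/any")
if f <> nil then
  dim q as string = Chr(34) // the double-quote character, to keep the literal readable
  
  // build the same <video> markup as in the post above, using the partially decoded URL
  dim html as string = "<!DOCTYPE html><html><video src=" + q + _
    f.URLPathForInternetExplorer + q + "></video></html>"
  
  // LoadPage with HTML source takes a FolderItem used to resolve relative references
  HTMLViewer1.LoadPage(html, f.Parent)
end if
```

Spaces and other ASCII escapes stay percent-encoded in the src attribute, while the ü arrives as raw UTF-8, which matches the form that was reported to work.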

You should have said that from the start…