Shell and Text Encodings (RB)

I am trying to use shell commands to access the 7zip command line interface. For example, I wish to list the contents of a .7z file which has a Japanese name.

  1. Using the file name from RB on the command line results in a “file not found” error when the shell executes it.
  2. Manually starting cmd, issuing “chcp 65001”, then pasting in the 7zip command line from RB works.
  3. Altering the RB shell command to have two parts (chcp 65001, then “&”, then the 7zip command) does not work.
  4. Putting this in a batch file does not work either: batch files saved in UTF-8 just crash, closing cmd even when run from cmd.

Can anyone suggest an answer? Thanks.

( Win7 )

Would that have anything to do with this? https://forum.xojo.com/22402-unicode-folderitem-shellpath

The issue looks somewhat similar.

Unfortunately there was never any definite solution there, but some of the leads could possibly be of use.

I renamed a log file to ??.log, hoping that my US English Windows 10 would give me a DOS short name with DIR /x, but no joy. Somehow all I get is ??.LOG and no short name. So we seem to be stuck with the ideograph name.

I was able to rename the file with ren ??.log two.log, but that works only if exactly one file with a two-character name exists, since each ? acts as a single-character wildcard.

I also checked that Xojo was able to locate the file as a FolderItem and rename it with

dim f as FolderItem = SpecialFolder.Desktop.child("??.log")
f.name = "othername.log"

Maybe that can be your solution.

Another way could be to use this:

Text = DecodeURLComponent(f.URLPath)

It produces a somewhat ASCII equivalent of the ?? name that could be used to rename the file to:
file:///C:/Users/mitch/Desktop/桌面.log
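
To illustrate the idea of an ASCII stand-in for such a name, here is a minimal sketch. It is in Python rather than Xojo, purely so it can be run standalone, and it assumes the name is percent-encoded as UTF-8, which is how URL paths are normally escaped. The helper names ascii_safe_name and original_name are made up for this example.

```python
from urllib.parse import quote, unquote


def ascii_safe_name(name: str) -> str:
    """Percent-encode a file name as UTF-8 so that the result is pure ASCII."""
    # safe="" means nothing beyond letters, digits and "_.-~" is left unescaped.
    return quote(name, safe="")


def original_name(encoded: str) -> str:
    """Reverse ascii_safe_name() and recover the original name."""
    return unquote(encoded)


encoded = ascii_safe_name("桌面.log")
print(encoded)                 # ASCII-only stand-in for the name
print(original_name(encoded))  # the original name again
```

Because the encoding is reversible, the original name can be recovered from the ASCII form after processing.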

Fact is, if the RB/Xojo shell apparently does not support UTF-16, nothing says the 7z command line does either. The Windows Command Prompt does not.

Thank you very much, Michel - I appreciate your time and ideas.

I think it is the same issue as the Chinese one. It is NOT a RB/Xojo issue but a Windows one, as I can access these files perfectly fine, correct names and all.

I am going to look at your last suggestion - have not had time yet.

The rename option will work, renaming to something in ASCII. What I have done for now is to copy the file using RB's FolderItem.CopyFileTo, but that is not ideal and the rename option is better.

Thanks again.

Thanks again, Michel. I have gone with the rename, then renaming back again after processing, and it is satisfactory. I am not going to risk the last idea for fear of ending up with some really funny names.
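
The rename, process, rename-back round trip could be sketched roughly like this. It is Python rather than RB/Xojo, and the helper temporary_ascii_name is hypothetical. It assumes only the file name is the problem; the folder path is left alone and may still contain non-ASCII characters.

```python
import os
import uuid
from contextlib import contextmanager


@contextmanager
def temporary_ascii_name(path):
    """Rename `path` to a throwaway ASCII name, yield that name, then restore it."""
    original = os.path.abspath(path)
    folder, name = os.path.split(original)
    ext = os.path.splitext(name)[1]
    if not ext.isascii():
        ext = ""  # drop the extension if it is not ASCII either
    temp_path = os.path.join(folder, "tmp_" + uuid.uuid4().hex + ext)
    os.rename(original, temp_path)
    try:
        yield temp_path
    finally:
        # Restore the original name even if the processing step raised an error.
        os.rename(temp_path, original)
```

Usage would be along the lines of `with temporary_ascii_name(archive) as safe:` followed by the 7-Zip call on `safe`. The try/finally is there so a failed 7-Zip run does not leave the archive stuck under its temporary name.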

Now, what to do about extracting archives with Shift-JIS names inside, when you don’t know which archives are like that? :wink:

( I’ll leave that to running the 7Zip gui under applocale, which does work ).

I compacted an archive with my ??.log inside, no problem. Then extracting all produces a ??.log back.

Since there seems to be no way to enter UTF-16 in the Command Prompt, I would not know how to unpack that one file specifically, though.

I am winning with the unpacking, using your rename idea. Once you can get 7-zip going on a file, it has no problem with UTF-8 names within the archive. It is just the actual archive name in a shell command that fails.
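
For comparison, here is a sketch of launching 7-Zip without going through cmd and its code page at all. It is in Python, where an argument list is handed to Windows as Unicode; whether the RB/Xojo Shell class can be made to do the same is not something this thread establishes. It assumes a 7z executable is on the PATH, and the helper names are made up. "l" is 7-Zip's list command.

```python
import shutil
import subprocess


def build_list_command(archive_path, seven_zip="7z"):
    """Build the argument list for listing an archive's contents."""
    return [seven_zip, "l", archive_path]


def list_archive(archive_path, seven_zip="7z"):
    """Run 7-Zip directly (no cmd, no chcp) and return (exit code, raw output bytes)."""
    exe = shutil.which(seven_zip)
    if exe is None:
        raise FileNotFoundError("7-Zip executable not found: " + seven_zip)
    result = subprocess.run(build_list_command(archive_path, exe),
                            capture_output=True)
    # Output is left as bytes; how 7-Zip encodes its console output is not assumed.
    return result.returncode, result.stdout
```

Passing the arguments as a list rather than one command string means no shell parses the file name, so the active code page never comes into play.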

I see there is a good scheme for packing files with UTF-8 names into an archive (the archive had better have an ASCII name, or be renamed once done). 7-zip allows you to make a list of files in a .txt file, and says it expects that text file to be in UTF-8!

http://sevenzip.osdn.jp/chm/cmdline/syntax.htm

[quote]You can supply one or more filenames or wildcards for special list files (files containing lists of files). The filenames in such list file must be separated by new line symbol(s).

For list files, 7-Zip uses UTF-8 encoding by default.[/quote]

I will test.
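
A minimal sketch of that list-file scheme, in Python, assuming one file name per line, UTF-8 without a BOM, and the documented @listfile syntax. The helper names and the example file names are made up.

```python
def write_list_file(names, list_path):
    """Write one file name per line, encoded as UTF-8 (no BOM)."""
    with open(list_path, "w", encoding="utf-8", newline="\n") as fh:
        for name in names:
            fh.write(name + "\n")


def build_add_command(archive_path, list_path, seven_zip="7z"):
    """Build the argument list for adding the listed files to an archive."""
    # "a" is 7-Zip's add command; "@" marks the argument as a list file.
    return [seven_zip, "a", archive_path, "@" + list_path]


print(build_add_command("backup.7z", "files.txt"))
```

Only the list file and the archive name appear on the command line, so both can be kept ASCII while the names inside the list stay as they are.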

( I like 7-zip because it can pack/unpack several formats, including 7z, zip, rar, lzh ).

Windows stores Japanese file names as UTF-16, though. But if UTF-8 works fine in the list, you’re set :slight_smile: