UNescaping a string

\xE2\x80\xA6 = … (alt+dot)
\x2C = ,
\n = Line break
"you\'ve" = the \ shouldn't be there.

Does anyone have a simple “unescape” procedure? The above are just examples :slight_smile:

I need to accept a string from a file and convert it to an actual displayable value.
Normal items are simple… but I'm not sure about the hex items (in case encoding etc. becomes an issue).

The assumption is that an actual backslash will be entered as “\\”?

I’d use regular expressions to find and replace the hex characters, then ReplaceAll to replace the known substitutions like "\n". Finally, I’d go back to regular expressions to find and remove the slashes.

Why regex? Because if you just replace “\” with nothing, you will also get rid of an escaped backslash (“\\”) or a “\” at the end of the document. The pattern “\\(.)” with a replacement pattern of “$1” will do this correctly: it removes each escaping backslash and keeps the character that follows it.
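A minimal sketch of that slash-stripping step, using the pattern and replacement quoted above with Xojo's RegEx class. The variable names and the sample text are made up for illustration, and this only covers the final step, not the hex or "\n" substitutions.

```xojo
// Hypothetical input. Xojo string literals do not treat the backslash
// specially, so this is exactly the text: you\'ve got a \\ here
Dim escaped As String = "you\'ve got a \\ here"

Dim re As New RegEx
re.SearchPattern = "\\(.)"           // a literal backslash, then capture the next character
re.ReplacementPattern = "$1"         // keep only the captured character
re.Options.ReplaceAllMatches = True  // without this, only the first match is replaced

Dim plain As String = re.Replace(escaped)
// plain is now: you've got a \ here
```

Because each match consumes the backslash together with the character after it, an escaped backslash collapses to a single backslash instead of disappearing.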

BTW, the hex pattern in your example is UTF-8, so you can just DefineEncoding when you’re done.

Well, I replaced the \x## sequences with their decimal values, converted to UTF8 encoding… and got 3 characters of gibberish.

Regex is great… if you are a regex expert… I can never get my head around it… so I usually avoid it, as it violates (for me) my 1st commandment (code must be readable).

so here is what I thought would work (and does for all but the \X stuff)

  Dim t As String
  Dim x As Integer
  dim c as string
  s="\xE2\x80\xA6"
  x=InStr(s,"\")
  t=s
  If x>0 Then 
    t=Left(s,x-1)
    While x<=Len(s) // <= so the last character is not skipped
      c=Mid(s,x,1)
      x=x+1
      If c="\" Then 
        c=Mid(s,x,1)
        x=x+1
        Select Case c
        Case "'",ChrB(34),"\"
          t=t+c
        Case "x" // hex
          If x<=Len(s)- 1 Then 
            c=mid(s,x,1)
            x=x+1
            c=c+mid(s,x,1)
            x=x+1
            //t=t+Encodings.UTF8.Chr(Val("&H"+c))
            t=t+ChrB(Val("&H"+c))
          End If
        end select
      Else
        t=t+c
      End If
    Wend
    t=t.ConvertEncoding(encodings.UTF8)
    msgbox t+"   ="+s
  End If
  Return t

You need DefineEncoding, not ConvertEncoding.
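A minimal sketch of the difference, assuming the bytes were built with ChrB as in the loop above (the variable names here are made up for illustration). DefineEncoding does not change the bytes; it only tells Xojo how to interpret them.

```xojo
// Three raw bytes assembled with ChrB; the resulting string has no encoding yet.
Dim raw As String = ChrB(&hE2) + ChrB(&h80) + ChrB(&hA6)

// Label the existing bytes as UTF-8. Nothing is transcoded.
Dim shown As String = DefineEncoding(raw, Encodings.UTF8)

// The same three bytes now read as a single character, the ellipsis:
// Len(shown) is 1, while LenB(shown) is still 3.
MsgBox shown
```

ConvertEncoding, by contrast, transcodes the text from whatever encoding the string is believed to have, which is why it is the wrong tool for bytes that are already valid UTF-8 but carry no encoding label.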

BINGO! Thanks… that did it :slight_smile: