\\xE2\\x80\\xA6 = … (alt+dot)
\\x2C = ,
\\n = Line break
"you\\'ve" = \\ Shouldn't be there.
Does anyone have a simple “unescape” procedure? The above are just examples.
I need to accept a string from a file and convert it to an actual displayable value.
Normal items are simple… but I'm not sure about the hex items (if encoding etc. becomes an issue).
The assumption is that an actual backslash will be entered as “\\”?
I’d use regular expressions to find and replace the hex characters, then ReplaceAll to replace the known substitutions like "\n". Finally, I’d go back to regular expressions to find and remove the slashes.
Why regex? Because if you just replace “\” with nothing, you will end up getting rid of “\\” or “\” at the end of the document. The pattern “\\(.)” with a replacement pattern of “$1” will do this correctly.
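For the last step, a minimal sketch of that pattern with Xojo's RegEx class might look like this. The sample string and the variable names are only illustrative, and it assumes the hex sequences and known substitutions have already been handled:

```xojo
// Illustrative input: an escaped apostrophe and an escaped backslash.
Dim s As String = "you\'ve got a \\ here"

Dim re As New RegEx
re.SearchPattern = "\\(.)"           // a backslash followed by any one character
re.ReplacementPattern = "$1"         // keep only the character after the backslash
re.Options.ReplaceAllMatches = True  // replace every match, not just the first

Dim cleaned As String = re.Replace(s)
// cleaned should now read: you've got a \ here
```

Because each match consumes the backslash together with the character after it, an escaped backslash collapses to one backslash instead of vanishing.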
well I replaced the \X## sequences with the decimal values, converted to UTF8 encoding… and got 3 characters of gibberish
RegEx is great… if you are a RegEx expert… I can never get my head around it… so I usually avoid it, as it violates (for me) my 1st commandment (code must be readable).
so here is what I thought would work (and does for all but the \X stuff)
Dim t As String
Dim x As Integer
Dim c As String
s="\\xE2\\x80\\xA6"
x=InStr(s,"\")
t=s
If x>0 Then
  t=Left(s,x-1) // everything before the first backslash is copied as-is
  While x<=Len(s) // <= so the last character is not dropped
    c=Mid(s,x,1)
    x=x+1
    If c="\" Then
      c=Mid(s,x,1) // the character being escaped
      x=x+1
      Select Case c
      Case "'",ChrB(34),"\"
        t=t+c
      Case "x" // hex
        If x<=Len(s)-1 Then
          c=Mid(s,x,1)
          x=x+1
          c=c+Mid(s,x,1)
          x=x+1
          //t=t+Encodings.UTF8.Chr(Val("&H"+c))
          t=t+ChrB(Val("&H"+c)) // append the raw byte
        End If
      End Select
    Else
      t=t+c
    End If
  Wend
  // The \x bytes are already UTF-8, so label them rather than convert them
  t=DefineEncoding(t,Encodings.UTF8)
  MsgBox t+" ="+s
End If
Return t
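On the three characters of gibberish: E2 80 A6 is the UTF-8 byte sequence for a single character (…, U+2026), so the three bytes have to be collected raw and then labeled as UTF-8 as a group. Turning each one into its own character, or converting bytes that carry no encoding, is the likely cause. A minimal sketch, assuming the classic ChrB and DefineEncoding functions and made-up variable names:

```xojo
// Build the raw bytes; a string made with ChrB carries no encoding.
Dim raw As String = ChrB(&hE2) + ChrB(&h80) + ChrB(&hA6)

// DefineEncoding only labels the existing bytes as UTF-8; it does not change them.
Dim shown As String = DefineEncoding(raw, Encodings.UTF8)

MsgBox shown // should display a single ellipsis character
```

By contrast, Encodings.UTF8.Chr(&hE2) treats E2 as a code point and returns â, which is why decoding the hex values one at a time gives three unrelated characters.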