I have a string that I need to decode and it’s in the following format:
“\x7b\x22responseContext\x22:\x7b\x22serviceTrackingParams\x22:\x5b\x7b\x22service\x22:\x22CSI\x22,\x22params\x22:\x5b\x7b\x22key\x22:\x22c\x22,\x22value\x22:\x22WEB_REMIX\x22\x7d,\x7b\x22key\x22:\x22cver\x22,\x22value\x22:\x221.20230814.01.00\x22\x7d…”
The two digits following \x are the hex code of the ASCII character.
I wrote something that does it with regular Xojo string parsing, and it works, but it’s insanely slow when you have megabytes and megabytes to process.
Thanks in advance.
Assuming you know nothing about the source other than that it uses ASCII codes, I’d use a MemoryBlock. Especially if you use Ptrs, that should be very fast.
Let me see if I can work up the code…
Isn’t this basically percent encoding with \x instead of %?
Dim s As String = encodedstring
s = ReplaceAll(s, "\x", "%")
s = DecodeURLComponent(s)
Note, minimally tested, but it takes less than 4s for about 25 MB once compiled:
Private Function ToValue(hexByte As Integer) As Integer
  const kZero as integer = 48
  const kUpperA as integer = 65
  const kLowerA as integer = 97
  select case hexByte
  case is >= kLowerA
    return hexByte - kLowerA + 10
  case is >= kUpperA
    return hexByte - kUpperA + 10
  case else
    return hexByte - kZero
  end select
End Function
Public Function Decode(source As String) As String
  var sourceMb as MemoryBlock = source
  var destMb as new MemoryBlock( sourceMb.Size )
  var sourcePtr as ptr = sourceMb
  var destPtr as ptr = destMb
  var sourceIndex as integer = 0
  var destIndex as integer = -1 // index of the last byte written
  var goneTooFarIndex as integer = sourceMb.Size - 3
  const kBackslash as integer = 92
  const kX as integer = 120
  while sourceIndex < sourceMb.Size
    var thisByte as integer = sourcePtr.Byte( sourceIndex )
    if thisByte = kBackslash and sourceIndex < goneTooFarIndex then
      var nextByte as integer = sourcePtr.Byte( sourceIndex + 1 )
      if nextByte = kX then
        //
        // \xHH: combine the two hex digits into one byte
        //
        thisByte = ToValue( sourcePtr.Byte( sourceIndex + 2 ) ) * 16 + _
        ToValue( sourcePtr.Byte( sourceIndex + 3 ) )
        sourceIndex = sourceIndex + 3
      else
        //
        // It's something else
        //
        thisByte = nextByte
        sourceIndex = sourceIndex + 1
      end if
    end if
    destIndex = destIndex + 1
    destPtr.Byte( destIndex ) = thisByte
    sourceIndex = sourceIndex + 1
  wend
  //
  // destIndex is the last index written, so the byte count is destIndex + 1
  //
  return destMb.StringValue( 0, destIndex + 1, Encodings.UTF8 )
End Function
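Since the code above is described as minimally tested, a quick sanity check along these lines may be worth running before trusting it on real data. This is only a sketch: the sample string and the name SanityCheckDecode are made up for illustration, and it assumes the Decode function above is in scope. It compares the result against the expected text, which also confirms that the final character survives the conversion back to a String.

```xojo
Public Sub SanityCheckDecode()
  // Hypothetical sample input. Xojo string literals do not process
  // backslash escapes, so this holds the raw \x sequences.
  var encoded as string = "\x7b\x22key\x22:\x22c\x22\x7d"
  // The text we hope to get back: {"key":"c"}
  // (quotes are doubled inside a Xojo string literal)
  var expected as string = "{""key"":""c""}"
  var decoded as string = Decode( encoded )
  if decoded = expected then
    System.DebugLog( "Decode OK" )
  else
    System.DebugLog( "Decode mismatch: " + decoded )
  end if
End Sub
```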
Unless the source string might contain \\x. Otherwise, your plan takes 0.5s here compiled.
Public Function Decode2(s As String) As String
return DecodeURLComponent( s.ReplaceAll( "\x", "%" ) )
End Function
BUT if I add pragmas, mine takes 0.2 s. So there. 
Add this to the top of each method in my first post:
#if not DebugBuild
  #pragma BackgroundTasks false
  #pragma BoundsChecking false
  #pragma NilObjectChecking false
  #pragma StackOverflowChecking false
#endif
To avoid improper percent decoding, you also need to encode any literal % characters found in the source string first, before the “URL decoding”:
Return DecodeURLComponent( s.ReplaceAll("%","%25").ReplaceAll( "\x", "%" ) )
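To illustrate why the extra ReplaceAll matters, here is a small comparison. It is a sketch with a made-up sample string and hypothetical variable names, and it assumes DecodeURLComponent performs plain percent-decoding on this input.

```xojo
// Hypothetical sample: a literal "%41" followed by an escaped "B" (\x42)
var s as string = "A%41\x42"

// Without escaping: the string becomes "A%41%42" before decoding,
// so the literal "%41" is decoded too and the result should be "AAB"
var naive as string = DecodeURLComponent( s.ReplaceAll( "\x", "%" ) )

// With escaping: the string becomes "A%2541%42" before decoding,
// so the result should be "A%41B" with the literal "%41" preserved
var safe as string = DecodeURLComponent( s.ReplaceAll( "%", "%25" ).ReplaceAll( "\x", "%" ) )
```

The order of the two replacements matters: the literal % characters have to be escaped before \x is turned into %, otherwise the newly created % signs would be escaped as well.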
@Kem_Tekinay, @Andrew_Lambert and @Rick_Araujo, I can’t thank you enough. These are all great approaches that work orders of magnitude faster than what I was doing. Kem, that is a truly magnificent piece of code, thanks for writing that up. Andrew and Rick, you’re spot on about the simplicity of the problem that was staring me in the face!
Thanks everyone, I really appreciate it.