And here's version 5, which is twice as fast as the previous one: we are now at 54 ms for 2 MB, a speedup of over 35x from the original. This version just uses the plain Xor operator, which (I think) does the right thing when given integer operands:
Function rc4v5(strData as string, strKey as String) As string
  #Pragma DisableBackgroundTasks
  #Pragma DisableBoundsChecking
  #Pragma NilObjectChecking False

  Dim MM as MemoryBlock = strData           // input bytes
  Dim MM2 as New MemoryBlock(mm.Size)       // output bytes
  dim mmKey as MemoryBlock = strKey
  dim memAsciiArray as new MemoryBlock(256) // the 256-byte state table
  dim memKeyArray as new MemoryBlock(256)   // key repeated to fill 256 bytes
  dim memJump as integer
  dim memTemp as integer
  dim memY as integer
  dim intKeyLength as integer
  dim intIndex as integer
  dim intT as integer
  dim intX as integer

  intKeyLength = lenb(strKey)

  // Repeat the key until it fills all 256 slots
  dim pK as ptr = memKeyArray
  dim pKey as ptr = mmKey
  for intIndex = 0 to 255
    pK.byte(intIndex) = pKey.byte(intIndex mod intKeyLength)
  next

  // Start the state table as 0..255
  dim pA as ptr = memAsciiArray
  for intIndex = 0 to 255
    pA.byte(intIndex) = intIndex
  next

  // Key scheduling: shuffle the state table using the key
  for intIndex = 0 to 255
    memJump = (memJump + pA.byte(intIndex) + pK.byte(intIndex)) mod 256
    memTemp = pA.byte(intIndex)
    pA.byte(intIndex) = pA.byte(memJump)
    pA.byte(memJump) = memTemp
  next

  intIndex = 0
  memJump = 0
  dim p1 as ptr = MM
  dim p2 as ptr = mm2
  dim mm2size as integer = mm2.size

  for intX = 1 to mm2size
    // Compare-and-subtract instead of mod 256
    intIndex = intIndex + 1
    if intIndex > 255 then
      intIndex = 0
    end if
    memJump = memJump + pA.byte(intIndex)
    if memJump > 255 then
      memJump = memJump - 256
    end if
    intT = pA.byte(intIndex) + pA.byte(memJump)
    if intT > 255 then
      intT = intT - 256
    end if

    // Swap the two state entries
    memTemp = pA.byte(intIndex)
    pA.byte(intIndex) = pA.byte(memJump)
    pA.byte(memJump) = memTemp
    memY = pA.byte(intT)

    //mm2.Byte(intX - 1) = bitwise.bitxor(val("&h" + hex(MM.byte(IntX - 1))), bitwise.bitxor(memTemp,memY))
    //mm2.byte(intX-1) = Bitwise.BitXor(mm.byte(intX-1), memTemp, memY)
    //p2.byte(intX-1) = Bitwise.BitXor(p2.byte(intX-1), memTemp, memY)

    // Read the input byte through p1 (MM); MM2 is freshly allocated,
    // so reading p2 here would ignore strData entirely
    p2.byte(intX-1) = p1.byte(intX-1) Xor memTemp Xor memY
  next

  return MM2
End Function
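Since the post only says "(I think)" about Xor on integers, here is a small sanity check you could drop into a test method. It is a sketch and not part of the original function: the variable names and the MsgBox reporting are placeholders of mine, and it assumes Bitwise.BitXor (the call the earlier versions used) as the reference.

```
// Hypothetical sanity check, not part of rc4v5
Dim a As Integer = &hA5   // 1010 0101
Dim b As Integer = &h0F   // 0000 1111

Dim viaOperator As Integer = a Xor b
Dim viaModule As Integer = Bitwise.BitXor(a, b)

// With integer operands, Xor should work bit by bit, same as Bitwise.BitXor
If viaOperator <> viaModule Then
  MsgBox "Xor and Bitwise.BitXor disagree"
End If

// XOR-ing twice with the same value gives back the original,
// which is why one routine can both encrypt and decrypt
If (viaOperator Xor b) <> a Then
  MsgBox "Xor round trip failed"
End If
```

If neither message box appears, the operator form is a drop-in replacement for the Bitwise.BitXor call in the inner loop.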