Compressing Numeric Data in JSON String

Looking for suggestions.

I’m sending data between two machines, formatted as JSON strings. It’s working very well.
One of the elements I have to send frequently is arrays of numbers, which I encode into a single string value within the JSONItem. This can add up to become a big string of data in the end.

I’m trying out ways to compress or reduce the size of that data. It’s not essential that I do this (it seems to be working pretty well), but I’m thinking ahead to scalability and efficiency.

I thought maybe creating a memory block, filling it with doubles or singles, converting to string - but that doesn’t save much (surprisingly) and then I have to encode it so that it doesn’t mess up the JSON string. Once I’ve done that it’s about the same size again.

Any thoughts or suggestions?

What kind of numbers (integers or doubles), and what are the potential ranges of each?

In code I am using doubles, but realistically they can be singles. Ranges should never be more than +/- 256.

Use ToString to limit or eliminate the decimals, but that’s all I can think of.
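To put a rough shape on that idea, here is a minimal sketch in Python rather than Xojo (the thread has no code of its own, so treat every name here as hypothetical). It assumes two decimal places are enough, which the original poster has not confirmed, and it trims trailing zeros so whole numbers cost fewer characters.

```python
import json

def encode_text(values, decimals=2):
    """Join numbers into one comma-separated string at a fixed precision."""
    parts = []
    for v in values:
        s = f"{v:.{decimals}f}"
        # Trim trailing zeros only when there is a decimal point,
        # so "100" is never shortened to "1".
        if "." in s:
            s = s.rstrip("0").rstrip(".")
        # Tiny negative values round to "-0"; normalise that to "0".
        parts.append("0" if s == "-0" else s)
    return ",".join(parts)

def decode_text(text):
    """Inverse of encode_text (up to the rounding that was applied)."""
    return [float(p) for p in text.split(",")] if text else []

values = [12.3456789, -200.5, 0.000123, 255.0]
payload = json.dumps({"data": encode_text(values)})
```

This is lossy by design: anything past the chosen number of decimals is gone after the round trip.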

yeah, I thought of that too.
I was looking at compression too, but they all then require some string-friendly encoding after, which eliminates the benefit.

Maybe someone knows a way I’m unfamiliar with.

Thanks for the response, Kem.

Is there a high possibility of duplicate values?
If so, you can create an array of unique values and an array of indices into that array.
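A minimal sketch of that unique-values-plus-indices layout, in Python rather than Xojo; `to_lookup` and `from_lookup` are made-up names for illustration. It only pays off when values repeat often, since otherwise the index array is pure overhead.

```python
def to_lookup(values):
    """Split values into (unique values, index of each original value)."""
    unique = []
    index_of = {}   # value -> position in `unique`
    indices = []
    for v in values:
        if v not in index_of:
            index_of[v] = len(unique)
            unique.append(v)
        indices.append(index_of[v])
    return unique, indices

def from_lookup(unique, indices):
    """Rebuild the original array from the two arrays."""
    return [unique[i] for i in indices]

unique, indices = to_lookup([1.5, 2.0, 1.5, 1.5, 2.0])
restored = from_lookup(unique, indices)
```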

If singles will suffice, the raw data is 4 bytes, so little is gained by turning that into any other storage form.
How much compression you get using zip style compression of your original array will depend upon the size of the array, but coercing it from doubles to singles first will halve the storage at a stroke.
Making the binary array usable inside JSON would then involve using EncodeBase64 or similar, which increases the string size by about 33% (every 3 bytes become 4 characters).
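For a feel of the numbers, here is a small Python sketch (not Xojo; the function names are hypothetical) that packs values as little-endian 4-byte singles and Base64-encodes the result so it can sit in a JSON string. With 300 values, the 1200 raw bytes become 1600 characters.

```python
import base64
import struct

def pack_singles(values):
    """Pack numbers as little-endian float32 and return Base64 text."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return base64.b64encode(raw).decode("ascii")

def unpack_singles(text):
    """Decode Base64 text back into a list of float32 values."""
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

encoded = pack_singles([1.5, -200.25, 0.0])
decoded = unpack_singles(encoded)
```

Values that are not exactly representable as a single will come back slightly different from the original doubles, so this is only suitable if single precision really is enough.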

I’m going to keep toying with this.

Yeah, there is little to no chance of duplicate numbers.

I would suggest you generate a test data set a bit larger than the largest you would ever expect to need, then try sending that and looking at performance… If it is good enough, then I would not worry about any more optimization.

Karen

It depends on what you are optimizing for.

If network transmission time is of paramount concern, I suggest implementing socket-level compression via ZIP or similar. This could have the happy side effect of improving the performance of all of your JSON transmissions if you applied it more widely.

If encode/decode time or CPU usage is most important, you may want to reconsider using JSON in the first place and switch to a binary format that you could dump into a MemoryBlock and extract the values that way.

But most importantly, consider whether you should do anything at all (which it sounds like you have thought about already). If the existing system works correctly and is sufficiently performant with the current load AND with whatever load you can reasonably expect, then I’d leave it alone. You’d have a system that meets the requirements and does so via a human-readable format; that’s pretty valuable for documentation and troubleshooting.


It is way more about network transmission time.
I’m not having much impact with zip. I’ll try a bunch of methods.

If you are having to convert the ZIPped data into JSON-friendly text, you won’t see much compression at all. My suggestion was that the data that actually travels over the wire is binary ZIP data, not JSON. My hypothetical socket would accept any data for sending, compress it using ZIP, transfer it to the remote socket, where it would be uncompressed before being presented to the code at the other end.
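A rough sketch of that arrangement in Python (not Xojo). It assumes zlib's DEFLATE stands in for "ZIP" and that each message is sent as a 4-byte length followed by the compressed bytes; the framing and the names `encode_frame` / `decode_frame` are my own invention for illustration, not an existing protocol. The point is that the compressed bytes go on the wire as-is, with no Base64 step.

```python
import json
import struct
import zlib

def encode_frame(obj):
    """Serialise to JSON, compress, and prefix with a 4-byte big-endian length."""
    text = json.dumps(obj, separators=(",", ":"))
    body = zlib.compress(text.encode("utf-8"))
    return struct.pack(">I", len(body)) + body

def decode_frame(frame):
    """Reverse encode_frame: read the length, decompress, parse the JSON."""
    (length,) = struct.unpack(">I", frame[:4])
    body = frame[4:4 + length]
    return json.loads(zlib.decompress(body).decode("utf-8"))

message = {"data": [i * 0.5 for i in range(1000)]}
frame = encode_frame(message)   # raw bytes, ready to write to a socket
restored = decode_frame(frame)
```

The application code on both ends still works with ordinary JSON; only the layer that touches the socket knows about compression. How much it saves depends heavily on the data, so it is worth measuring with a realistic payload.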
