# How to Colorize Data?

Greetings -

I have a requirement to “colorize” a large amount of data (maybe 10E6 to 10E9 points) for display on a canvas. The data is effectively a 2D array with a value for every index pair, and I need to assign a color to each point depending on its numeric value. An example of a similar display can be seen at: http://hfradio.org.uk/cap64.jpg In this example, the horizontal axis is frequency (as on a radio dial) and the vertical axis is time, with the newest data at the top. The display constantly rolls downward as new data is added line by line at the top, giving it its colloquial name, “waterfall”.

My big question is how to map a numeric range into a color. For example, let’s say that N1 and N2 are the end points of the numeric range, and I want N1 mapped to RGB(255,0,0), N2 mapped to RGB(0,0,255), and (N1+N2)/2 mapped to RGB(0,255,0). How can I write a mapping function that returns a color for intermediate values? I don’t greatly care what the actual function is, as long as it does not produce visual discontinuities and does not have a large computational burden.

Your ideas and suggestions are genuinely appreciated.

Many thanks,
Jim Wagner
Oregon Research Electronics

Sounds like you need a GRADIENT function… a simple version (off the top of my head):

```
Function getColor(color1 As Color, color2 As Color, n1 As Integer, n2 As Integer, NP As Integer) As Color
  // Linear blend: NP = n1 returns color1, NP = n2 returns color2
  If n1 = n2 Then Return color1 // avoid dividing by zero
  Dim pct As Double
  Dim newRed As Integer
  Dim newGrn As Integer
  Dim newBlu As Integer
  // Fraction of the way from n1 to n2, clamped to 0..1
  pct = (NP - n1) / (n2 - n1)
  pct = Max(0.0, Min(1.0, pct))
  // Blend each channel from color1 toward color2
  newRed = color1.Red + (color2.Red - color1.Red) * pct
  newGrn = color1.Green + (color2.Green - color1.Green) * pct
  newBlu = color1.Blue + (color2.Blue - color1.Blue) * pct
  Return RGB(newRed, newGrn, newBlu)
End Function
```

NOTE: This is to demonstrate the math and method, NOT to be a fast solution!!
With millions of points, you will need to optimize whatever method you use.

If the start and end colors will always be the same, you can build a precalculated array of the color transitions instead of calculating them over and over again.

Thanks, Dave. I think that I see what you are doing. I like the idea of a precalculated array (a lookup table). Then, it would be a much simpler matter of mapping one numeric interval (data) into an integer interval (the set of lookup table indices). That should be pretty fast.
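The lookup-table idea could look roughly like the sketch below. It is written in Python rather than Xojo purely for illustration, and the names `build_lut`, `value_to_index`, and `colorize`, the 256-entry table size, and the red/green/blue stops (taken from the original question) are all assumptions, not anything posted in this thread.

```python
def build_lut(stops, size=256):
    """Precompute `size` RGB tuples by blending linearly between colour stops."""
    lut = []
    segments = len(stops) - 1
    for i in range(size):
        pos = i / (size - 1) * segments      # position along the chain of stops
        seg = min(int(pos), segments - 1)    # which pair of stops we are between
        t = pos - seg                        # 0..1 within that pair
        a, b = stops[seg], stops[seg + 1]
        lut.append(tuple(round(a[c] + (b[c] - a[c]) * t) for c in range(3)))
    return lut


def value_to_index(value, n1, n2, size=256):
    """Map a data value in [n1, n2] to a table index, clamping out-of-range values."""
    if n1 == n2:
        return 0
    frac = (value - n1) / (n2 - n1)
    frac = max(0.0, min(1.0, frac))
    return int(frac * (size - 1) + 0.5)      # round to the nearest index


# Hypothetical stops from the question: N1 -> red, midpoint -> green, N2 -> blue
RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)
LUT = build_lut([RED, GREEN, BLUE])


def colorize(value, n1, n2):
    """Per-point work is one subtraction, one division and one list lookup."""
    return LUT[value_to_index(value, n1, n2, len(LUT))]
```

The table is built once, so the per-point cost no longer depends on how complicated the gradient is.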

Jim

A dictionary might be faster for the lookup table.

Thanks, Markus -

You are probably correct. Appreciate the hint!

Jim

10E9 ?
You don’t have enough pixels

You’re likely to only display some subset of that many data points at best

Something to keep in mind…

The largest image that a picture can represent is 1.024E9 pixels, so you’ll likely have to draw the data on the fly.

Thanks Norman and Greg -

I am just now having to cope with that detail. Appreciate the heads-up! 60 days of data at 10 samples per second gives lots of “stuff” to deal with!

Jim

10 per sec
60 secs per min
60 mins per hours
24 hours per day
60 days

right ?
This is only 51 million and some odd points

Still more pixels than a 2880 x 1800 screen has though
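As a quick check of that arithmetic, here is a small Python sketch (the variable names are illustrative only) comparing the sample count with the pixel count of the 2880 x 1800 screen mentioned above:

```python
SAMPLES_PER_SECOND = 10
DAYS = 60

# 10 per second * 60 s * 60 min * 24 h * 60 days
total_samples = SAMPLES_PER_SECOND * 60 * 60 * 24 * DAYS

# Pixel count of the 2880 x 1800 screen used as the comparison
screen_pixels = 2880 * 1800

samples_per_pixel = total_samples / screen_pixels

print(total_samples)      # 51840000
print(screen_pixels)      # 5184000
print(samples_per_pixel)  # 10.0
```

So even if every pixel of such a screen were used, roughly ten raw samples would compete for each one.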

That is assuming the measurement tolerance is fine enough that rounding to pixel coordinates still yields 51 million unique coordinates. Statistically, it’s safe to say that once 51 million samples are rounded to the resolution of even the largest screen, the number of distinct points (while still quite large) will be smaller than the number of original samples.

I think that I can cut the number by 2 or maybe even 4 (by averaging blocks of samples). The sample rate is pretty accurate.
The data will be displayed in horizontal lines of 1 hour duration (36000 points without averaging). The plotted data is really an FFT of an hour’s data sampled at 10 Hz. The real part of the result contains N/2 points, ditto the imaginary part. I determine the magnitude of each bin, so there are N/2 points for each hour’s worth of data. The point is to compare the spectrum of successive hourly data over time.

Thanks for the insight,
Jim

If it doesn’t, and you get a lot of overlap, then you waste a lot of time calculating data only to plot it on a pixel that has already been drawn.

Just an observation that sometimes you have way more data than you can squeeze into a discrete number of pixels.

Remember that the range of your measurements is X1 to X2 and Y1 to Y2,
but assume for argument’s sake that you are using a 2560 x 1440 monitor (Wx,Wy).

Given a measured point at (Xm,Ym) where X1<=Xm<=X2 and Y1<=Ym<=Y2, the PIXEL point will be

```
xp=((Xm-X1)/(X2-X1))*Wx
yp=((Ym-Y1)/(Y2-Y1))*Wy
```

Your loss of resolution will be (X2-X1)/Wx data points per pixel,
so for example if (X2-X1) = 1,000,000
and Wx = 2560, then you are seeing about 390 real-world data points condensed to ONE pixel.

So I would NOT attempt to draw everything from the real data.

• Convert your real-world measurements to graphics coordinates (similar to above) [These will be INTEGER results]
• Remove all duplicate coordinates
• Plot whatever is left (this will, by definition, be no more than the number of pixels on the screen; in the example above you would plot only about 0.25% of the real data)
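The three steps above can be sketched as follows, in Python for illustration. The function names are hypothetical, and keeping the largest value when several points land on the same pixel is an assumption: the post only says to remove duplicates, so averaging or keeping the first point would be equally valid choices.

```python
def to_pixel(xm, ym, x1, x2, y1, y2, wx, wy):
    """Scale a measured point to integer pixel coordinates on a wx-by-wy canvas."""
    px = min(int((xm - x1) / (x2 - x1) * wx), wx - 1)  # clamp xm == x2 onto the last column
    py = min(int((ym - y1) / (y2 - y1) * wy), wy - 1)  # clamp ym == y2 onto the last row
    return px, py


def reduce_to_pixels(points, x1, x2, y1, y2, wx, wy):
    """Collapse (x, y, value) points so that each pixel appears at most once.

    Assumption: when several points share a pixel, the largest value wins.
    """
    pixels = {}
    for xm, ym, value in points:
        key = to_pixel(xm, ym, x1, x2, y1, y2, wx, wy)
        if key not in pixels or value > pixels[key]:
            pixels[key] = value
    return pixels


# Example: 100,000 points along one row collapse to at most 2560 pixels
points = [(x, 0, x % 7) for x in range(100_000)]
reduced = reduce_to_pixels(points, 0, 100_000, 0, 1, 2560, 1440)
print(len(points), "->", len(reduced))
```

Only the entries of `reduced` then need to be colorized and drawn.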

Really appreciate both sets of suggestions. I am right at the point where this is pertinent!

Actually, I think that averaging blocks of 2 or 4 spectrum bins might work. Right now, my frequency resolution is 0.3 mHz and I can easily tolerate 2-4 mHz. Of course, averaging several time-domain samples WILL be more efficient. But I need to compare the results of the two methods and see which appears visually more useful (if they are different).

As I think about it, averaging by N in the time domain reduces the effective sample rate by a factor of N, which reduces the Nyquist frequency, and therefore the maximum resolvable frequency, by the same factor. If I average by N in the frequency domain, it retains the maximum resolvable frequency but increases the bin width (minimum resolvable frequency) by a factor of N.

So, I don’t think that the two are exactly equivalent. Will have to experiment and see which is acceptable.
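The difference can be put in numbers with a small Python sketch. The function name and parameters are illustrative, and it relies on the standard relationships that the Nyquist frequency is half the sample rate and that the FFT bin width is the reciprocal of the record length. It treats time-domain averaging as simple decimation and ignores any filtering effect of the averaging itself.

```python
def spectrum_limits(sample_rate_hz, record_seconds, time_avg=1, freq_avg=1):
    """Return (nyquist_hz, bin_width_hz, bins_per_line) for one record.

    time_avg: samples averaged together before the FFT
    freq_avg: adjacent FFT bins averaged together after the FFT
    """
    n_samples = int(sample_rate_hz * record_seconds) // time_avg
    nyquist_hz = (sample_rate_hz / time_avg) / 2
    bin_width_hz = freq_avg / record_seconds   # 1/T per raw bin, times bins merged
    bins_per_line = n_samples // 2 // freq_avg
    return nyquist_hz, bin_width_hz, bins_per_line


# One hour of data at 10 samples per second, as described in the thread
print(spectrum_limits(10, 3600))               # no averaging
print(spectrum_limits(10, 3600, time_avg=4))   # average 4 samples in time
print(spectrum_limits(10, 3600, freq_avg=4))   # average 4 bins in frequency
```

Both averaging schemes produce the same number of points per line, but the time-domain version covers a quarter of the frequency span at the original bin width, while the frequency-domain version covers the full span with bins four times wider.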

Jim

By the way, back on the topic of colorization, I tried a simple triangular mask for each color and that resulted in really muddy yellow and blue-green. It worked much better if I did the mask like this. Suppose that 0 corresponds to “pure blue” and N corresponds to “pure red”. Then, for 0 <= n <= N (hope I have these ordered correctly),

0 <= n <= N/4 -> Blue = 255, Green = 255 * n / (N/4), Red = 0
N/4 <= n <= N/2 -> Blue = 255 * (N/2 - n) / (N/4), Green = 255, Red = 0
N/2 <= n <= 3N/4 -> Blue = 0, Green = 255, Red = 255 * (n - N/2) / (N/4)
3N/4 <= n <= N -> Blue = 0, Green = 255 * (N - n) / (N/4), Red = 255

This results in trapezoidal masks centered about each color. The flat “top” of each trapezoid has a half-width of N/4, centered about 0 (blue), N/2 (green), and N (red), and each side slopes over an interval of N/4.
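A lookup table built from the four segments described above might look like the following Python sketch. This is a reconstruction from the description, not Jim's actual algorithm, and the function names and the default table size are assumptions.

```python
def trapezoid_color(n, N):
    """Return (red, green, blue) for 0 <= n <= N using the four-segment ramp."""
    quarter = N / 4
    n = max(0, min(N, n))                                   # clamp out-of-range input
    if n <= quarter:                                        # blue -> cyan
        r, g, b = 0, 255 * n / quarter, 255
    elif n <= 2 * quarter:                                  # cyan -> green
        r, g, b = 0, 255, 255 * (N / 2 - n) / quarter
    elif n <= 3 * quarter:                                  # green -> yellow
        r, g, b = 255 * (n - N / 2) / quarter, 255, 0
    else:                                                   # yellow -> red
        r, g, b = 255, 255 * (N - n) / quarter, 0
    return round(r), round(g), round(b)


def build_trapezoid_lut(N=256):
    """Precompute the colour for every integer n from 0 to N inclusive."""
    return [trapezoid_color(n, N) for n in range(N + 1)]


lut = build_trapezoid_lut()
print(lut[0], lut[64], lut[128], lut[192], lut[256])
```

At every point two of the three channels sit at 0 or 255, so the quarter points come out as saturated cyan and yellow, which is consistent with those colors standing out.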

Now, the Yellow and Blue-Green (on a monitor) really pop out and it looks MUCH nicer. I’d be happy to provide the algorithm that produces a nice lookup table if anyone wants it.

Jim