So I’ve got a problem (okay more than one) but this one is driving me nuts.
I’m wrapping up a Core Image based application for processing HDR images, where all the complicated image processing is done in Core Image. I have a bunch of custom filters that do the math which the built-in filters cannot do.
- On a 2012 rMBP running El Cap, it processes a 5472 x 2998 image in 4 seconds (was 6 but I’ve shaved 2 off).
- On a 2015 MacBook running Yosemite, the same image and same settings takes 7 minutes (was 8, but I’ve shaved one off).
- On the same 2015, but running Sierra; same image and settings, takes 24 seconds.
- On a 2014 MBA running Yosemite, same image and settings, it took 30 minutes.
- Same 2014 MBA running El Cap, also 30 minutes.
If I simply export the image with no processing, it takes 2 to 3 seconds across the range of hardware and OS versions. The application is 64-bit, built with Xojo 2016r4.
I’ve tried all tests with and without GPU acceleration. The 2012 rMBP actually takes 23 seconds when using the dGPU (but that’s understandable, as the image is larger than the 1 GB of memory the GPU has; it’s a 32-bit per channel image).
Seems like for non-high end GPUs, Core Image is incapable of complicated tasks on larger resolution images.
Any suggestions? I’m not liking the idea of dumping Core Image and rewriting everything in C, but it seems like it might be the only way to get consistent results.
Oh, and drawing a small version to screen, works exactly how you’d expect and is mighty fast.
Do all of these have the same amount of RAM?
Are you sure that they were all set to use the high performance GPU?
Varying levels of RAM:
- 2012 has 16GB
- 2014 has 4GB
- 2015 has 8GB
For all of them I do most of the export with the dGPU off (only the 2012 has a dGPU), by creating a CIContext with useSoftware YES.
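For anyone following along in Swift rather than Xojo declares, here is a minimal sketch of the two kinds of context being compared. It assumes the "useSoftware YES" above refers to Core Image's kCIContextUseSoftwareRenderer option; the variable names are only for illustration.

```swift
import CoreImage

// CPU-only rendering: the Swift spelling of kCIContextUseSoftwareRenderer = YES.
let softwareContext = CIContext(options: [.useSoftwareRenderer: true])

// Default options: Core Image chooses the renderer itself, normally the GPU.
let defaultContext = CIContext()
```

Timing the same export through both contexts is the comparison reported in this thread.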
I think I may have narrowed it down to the use of the mix function in my custom kernels. I will test that soon by using a different function to blend pixel values.
How much RAM does your image processing need? The machine that takes so long has only 4 GB.
And it is only a MacBook Air, not exactly a speed demon in terms of CPU or GPU.
I’ve watched the processing take up to 1.2 GB of ram with this image, which according to activity monitor doesn’t push the machine into using swap.
Oops… Just tried hardware acceleration on the MBA and the machine froze…
On the 2015 rMB, it takes 8 minutes on Yosemite, but 25 seconds on Sierra…
Indeed it’s not, in day to day tasks it’s faster than the 2015 rMB, but to take 30 minutes to do a task, which can be done in 6 seconds on a 2012 is embarrassing.
I’m thinking I may have to break the task up: apply some rendering, relax, apply some more, rinse and repeat until it’s done.
[quote=308729:@Sam Rowlands]I think I may have narrowed it down to the use of the mix function in my custom kernels. I will test that soon by using a different function to blend pixel values.[/quote]
It wasn’t this.
Do you want me to do some testing on my 2012 Air? Has also 4 GB but runs on Sierra.
I’ve had a customer reporting that scrolling on a Sierra machine was very slow for an Einhugur DataGrid. It runs quite fine here on the Air. So very strange…
I think I may have it…
I was thinking about it while watching the memory usage jump all over the place, and I wondered whether the rendering chain is simply too complicated for Core Image on these machines.
So I broke the chain down into sections, then used a CIImageAccumulator to do some intermediate renders…
- 2015 rMB, Yosemite, software only: 16 seconds. With GPU: 8 seconds. Down from: 25 seconds or 8 minutes.
- 2012 rMBP El Cap, software only: 3.3 seconds. With GPU: 2.8 seconds. Down from: 4 seconds.
- 2014 MBA, Yosemite, software only: 15.79 seconds. With GPU: 5.45 seconds. Down from: 30 minutes!
That’s some performance boost huh? Now I’ll set about undoing some of my other experimental changes.
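A rough Swift sketch of the staged approach, for those who want to try it. It is not the code from the application: `renderInStages` and `stages` are hypothetical names, the `.RGBAf` working format is an assumption, and the input is assumed to have a finite extent.

```swift
import CoreImage

/// Applies `stages` in order, passing each stage's output through a
/// CIImageAccumulator instead of building one long filter chain.
/// `stages` is a hypothetical stand-in for groups of custom filters.
func renderInStages(_ input: CIImage,
                    stages: [(CIImage) -> CIImage],
                    colorSpace: CGColorSpace) -> CIImage? {
    // The initializer is failable, so bail out if no accumulator can be made.
    guard let accumulator = CIImageAccumulator(extent: input.extent,
                                               format: .RGBAf, // assumed format
                                               colorSpace: colorSpace) else {
        return nil
    }
    accumulator.setImage(input)
    for stage in stages {
        // image() returns the accumulator's current contents; setImage(_:)
        // stores the stage's result as the new contents.
        accumulator.setImage(stage(accumulator.image()))
    }
    return accumulator.image()
}
```

How many filters go into each stage is a tuning question; the timings above came from splitting the chain into a few sections, not from one stage per filter.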
Addendum, for those who might run into this situation and use this technique to solve it:
Make sure that you initialize the CIImageAccumulator with imageAccumulatorWithExtent:format:colorSpace:, otherwise the CIImageAccumulator will convert the color profile of the image to sRGB every time you use it, even if the image colorSpace is already sRGB.
There’s either a bug or incorrect documentation, as trying to get the colorSpace from a CIImage on Yosemite fails, so you need to extract it from the CGImage from which you created the CIImage.
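In Swift, pulling the color space from the source CGImage might look like the sketch below. The function name is hypothetical, and the sRGB fallback for images that report no color space is an assumption of this sketch, not something tested in this thread.

```swift
import CoreGraphics

/// Returns a color space to pass to the CIImageAccumulator initializer,
/// read from the source CGImage rather than from the CIImage.
func accumulatorColorSpace(for source: CGImage) -> CGColorSpace {
    // CGImage.colorSpace is optional (image masks, for example, have none).
    if let space = source.colorSpace {
        return space
    }
    // Assumed fallback: sRGB, or device RGB if sRGB cannot be created.
    return CGColorSpace(name: CGColorSpace.sRGB) ?? CGColorSpaceCreateDeviceRGB()
}
```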
I personally have decided to set the entire workflow to sRGB, so I use a CGBitmapContext to ensure that the pixel data is created using this colorSpace and also to use ARGB for onscreen display (gave me a couple more FPS).
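A sketch of that normalization step in Swift, assuming 8 bits per component and premultiplied alpha (the post does not state either, and an HDR path would presumably want a deeper format). `convertedToSRGB` is a hypothetical name.

```swift
import CoreGraphics

/// Redraws `image` into an sRGB bitmap with alpha first (ARGB order)
/// and returns the redrawn image, or nil if the context cannot be created.
func convertedToSRGB(_ image: CGImage) -> CGImage? {
    guard let sRGB = CGColorSpace(name: CGColorSpace.sRGB),
          let context = CGContext(data: nil,          // Core Graphics allocates
                                  width: image.width,
                                  height: image.height,
                                  bitsPerComponent: 8, // assumed depth
                                  bytesPerRow: 0,      // let Core Graphics pick
                                  space: sRGB,
                                  bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue)
    else { return nil }
    // Drawing converts the source pixels into the context's color space.
    context.draw(image, in: CGRect(x: 0, y: 0,
                                   width: image.width, height: image.height))
    return context.makeImage()
}
```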
Just updating this for those who come to it in the future.
I tried to process a 30 megapixel image: fine. A 50 megapixel image crashed the graphics card and made the machine go wonky. Even disabling the dedicated GPU didn’t stop it from crashing.
Instead of using a CIImageAccumulator, create a CGBitmapContext and do the intermediate rendering to that, then recreate the CIImage from that context.
I was able to process the 50 megapixel image, and even a 101 megapixel image (it took about a minute for the 101 megapixel image).
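One way this could look in Swift, as a sketch under assumptions: `flattened` is a hypothetical name, the bitmap is 8 bits per component with alpha first, `colorSpace` must be an RGB space, and the image must have a finite extent. The application's real code may differ.

```swift
import CoreImage

/// Renders `image` into a new bitmap context and wraps the pixels in a fresh
/// CIImage, so later filters start from a bitmap, not from the filter chain.
func flattened(_ image: CIImage, colorSpace: CGColorSpace) -> CIImage? {
    let extent = image.extent.integral
    guard !extent.isInfinite, !extent.isEmpty,
          let bitmap = CGContext(data: nil,
                                 width: Int(extent.width),
                                 height: Int(extent.height),
                                 bitsPerComponent: 8, // assumed depth
                                 bytesPerRow: 0,
                                 space: colorSpace,
                                 bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue)
    else { return nil }

    // A CIContext that draws straight into the bitmap context.
    let renderer = CIContext(cgContext: bitmap, options: nil)
    let target = CGRect(x: 0, y: 0, width: extent.width, height: extent.height)
    renderer.draw(image, in: target, from: extent)

    guard let rendered = bitmap.makeImage() else { return nil }
    // Move the result back to the original origin so extents still line up.
    return CIImage(cgImage: rendered)
        .transformed(by: CGAffineTransform(translationX: extent.origin.x,
                                           y: extent.origin.y))
}
```

Calling something like this between sections of the filter chain takes the place of the CIImageAccumulator step.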
So rule of thumb, Core Image is fricking awesome for iPhone photos, for everything else you need some cunningness to make it work.
In the future I’m going to replace Core Image with 100% my own code, as it’s simply too unreliable.
Further update: some of the older hardware running 10.10 will crash hard; updating them to El Capitan solves that issue.