Core Image Tiling and ROI

Our application uses Core Image to apply custom CIFilters to still images and video. I'm running into issues when the supplied image is large enough (> 4096 pixels in either dimension) that Core Image automatically tiles it. The simplest of these to describe is a filter that performs various mirroring effects - backwards, upside-down, etc.

The implementation portion of the filter provides a sampler (src) and passes this into the kernel with an roiCallback that uses the destRect, inset by -1 in both dimensions:

return [mirrorsKernel applyWithExtent:[src extent]
                          roiCallback:^CGRect(int index, CGRect destRect) {
                              return CGRectInset(destRect, -1, -1);
                          }
                            arguments:@[src]];

The kernel is very simple, sampling from the X coordinate equal to the src width - current coordinate:

float4 backwards(sampler image, destination dest)
{
    // Coordinate of the pixel we are producing, in working space.
    float2 dc = dest.coord();
    // Mirror horizontally: read from the opposite side of the source.
    dc.x = image.size().x - dc.x;
    return image.sample(image.transform(dc));
}

When this runs on an image that is wider than 4096, tiling happens: destRect is no longer the entire image, and the resulting output image is therefore incorrect. If the ROI returns [src extent] instead of destRect, the result is correct, but then every tile requires the entire source, which leads to serious performance issues when src gets too large.

All of this makes sense to me. What I'd like to know is whether there is a way to satisfy this filter's requirement of sampling from the entire source while still limiting the ROI to maintain performance. I think the answer is probably no within our current structure and performance limits, but I wanted to see if there's anything we're missing.

I am aware that the simple kernel above can be replaced with an affine transform, which is an option for backwards and upside-down mirroring. We have other kernels in this filter that perform mirroring of either half of the source image or one quadrant of the source image. In these cases, I suppose it might be possible (up to a point) to create a custom ROI that is only the portion of the source that is being mirrored. We have not attempted that yet.
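For the full-image flips, the affine route can be as simple as something like this (just a sketch; it assumes src is the filter's input CIImage, and the translation accounts for a non-zero extent origin):

CGRect extent = [src extent];
// Scale x by -1, then translate so the flipped image lands back on its
// original extent: x' = 2 * minX + width - x.
CGAffineTransform flip = CGAffineTransformMakeScale(-1.0, 1.0);
flip = CGAffineTransformTranslate(flip, -(2.0 * CGRectGetMinX(extent) + extent.size.width), 0.0);
CIImage *mirrored = [src imageByApplyingTransform:flip];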

Any thoughts/input appreciated, thanks!

Update: I have managed to replicate the functionality of our existing custom mirror kernels using cropping, affine transform, and compositing operations. It seems to work so far, at least with large (> 4096) still images that would otherwise get tiled. There's definitely a noticeable (though temporary) performance hit with a still image. I haven't tested with a movie yet, but I'm assuming that will also suffer, perhaps constantly with each new frame. Therefore, I'd still be interested to know if there's a way to keep the more-optimized kernel approach with the larger images/movies if at all possible.
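For anyone curious, the replacement pipeline has roughly this shape (a sketch of a left-onto-right half mirror; assumes the source extent has origin (0,0)):

CGRect extent = [src extent];
CGRect leftHalf = CGRectMake(0.0, 0.0, extent.size.width / 2.0, extent.size.height);
// Crop to the half being mirrored, flip it across the vertical center
// line (x' = width - x), and composite it back over the source.
CIImage *left = [src imageByCroppingToRect:leftHalf];
CGAffineTransform flip = CGAffineTransformMake(-1.0, 0.0, 0.0, 1.0, extent.size.width, 0.0);
CIImage *mirroredRight = [left imageByApplyingTransform:flip];
CIImage *result = [mirroredRight imageByCompositingOverImage:src];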

I recommend checking out the old Core Image Programming Guide on how to supply a ROI function.

Basically, you are given a rect in the target image (the one your filter should produce) and are asked what part of the input image your filter needs in order to produce it. For small images this is usually not relevant because Core Image processes the whole image in one go. For large images, however, CI applies tiling, i.e., it processes slices of the image in sequence and stitches them together at the end. For this, the ROI is very important.

In your mirroring example, the first tile might be the left side of the image and the second tile the right side. When your ROI is asked what part of the input is needed to produce the left side of the result, you need to return the right side of the input image because it's mirrored horizontally, and vice versa. So you basically have to apply the same x-mirroring trick you use when sampling to mirror the rect in your ROI callback.
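Something like this, using your names (a sketch; it assumes the source extent has origin (0,0), matching the W - x math in your kernel):

return [mirrorsKernel applyWithExtent:[src extent]
                          roiCallback:^CGRect(int index, CGRect destRect) {
                              CGFloat width = [src extent].size.width;
                              // Mirror destRect across the vertical center line:
                              // the output tile spanning [a, b] in x needs
                              // [width - b, width - a] from the input.
                              CGRect roi = CGRectMake(width - CGRectGetMaxX(destRect),
                                                      destRect.origin.y,
                                                      destRect.size.width,
                                                      destRect.size.height);
                              return CGRectInset(roi, -1, -1);
                          }
                            arguments:@[src]];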

Thanks for your reply. I have indeed reviewed the programming guide (many times over the years), and I'm aware of how to provide an ROI function. The guide could really do with an update (and some fixes for errors), plus better examples with more detailed explanations. So much of this stuff we've had to figure out by trial and error.

For a long time we've mostly been able to get away without supplying custom ROIs, just returning destRect or destRect inset by -1. But we do have some custom ROIs for kernels that take more than one sampler, where the two samplers may be different sizes.
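Those per-sampler ROIs key off the index argument, roughly like this (a sketch with made-up names; blendKernel and maskImage are hypothetical):

return [blendKernel applyWithExtent:[src extent]
                        roiCallback:^CGRect(int index, CGRect destRect) {
                            // index identifies which sampler Core Image is
                            // asking about, in the same order as arguments.
                            if (index == 0) {
                                // Main image: 1:1 mapping plus a 1px border.
                                return CGRectInset(destRect, -1, -1);
                            }
                            // Second, smaller input (e.g. a mask): needed in full.
                            return [maskImage extent];
                        }
                          arguments:@[src, maskImage]];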

After first reading your suggestion to reduce the ROI to only the portion being mirrored, I could see how some mirroring actions would require only a smaller portion of the input image (with a limited ROI), and that this could help in some cases. But I figured this would only help where the entire image was not needed (i.e., not the full-image mirror flips), and only where the ROI itself would not exceed the 4096-pixel limit.

I've edited our code to try this out and the results are good up to a point. I've kept the method of using the affine transform for the full image mirror flips, so the following comments only relate to the mirroring of a portion of the source image - either half or a quarter.

I've taken your suggestion and created a custom ROI that is the portion of the source image being mirrored - left half, top half, bottom-right quarter, etc. This works fine until the source texture gets too big, and logically, the point at which it gets too big depends on how large the ROI needs to be. For example, the ROI for the left-to-right mirror is larger than the ROI for a bottom-right quarter mirror, so the quarter mirror can handle a larger source image. At the point where the source texture is too large, the entire rendering loop (running at ~60Hz) of our app stalls. I'm assuming this is because the texture being passed into the CI filter chain is so large that it doesn't return fast enough for our rendering to complete in time, and this just snowballs. Because of this, I've made the decision to use the affine transforms instead of a kernel with a custom ROI whenever the source image size is > 8192 in either dimension.
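To illustrate the shape of what we ended up with (a sketch for the left-onto-right half mirror; assumes the source extent has origin (0,0), and mirrorWithAffineTransforms: is a hypothetical stand-in for our crop/transform/composite path):

// Inside our filter's output-image method (sketch).
CGRect extent = [src extent];
if (extent.size.width > 8192.0 || extent.size.height > 8192.0) {
    // Huge sources stall our render loop on the kernel path, so fall
    // back to crop/affine/composite.
    return [self mirrorWithAffineTransforms:src];
}
CGRect leftHalf = CGRectMake(0.0, 0.0, extent.size.width / 2.0, extent.size.height);
return [halfMirrorKernel applyWithExtent:extent
                             roiCallback:^CGRect(int index, CGRect destRect) {
                                 // Whatever the output tile, this kernel only
                                 // ever samples from the half being mirrored,
                                 // so the ROI never exceeds half the source.
                                 return CGRectInset(leftHalf, -1, -1);
                             }
                               arguments:@[src]];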

But I do want to double-check my assumptions of what is happening regarding the use of "destRect".

I know the clamping we saw with larger images and our original code happens because tiling means the entire source is not available for each "pass" through the kernel, and that supplying a custom ROI means the correct portion of the source IS available. I just want to check my understanding: when you use "destRect" in the ROI, are you always going to get a tiled rect, assuming the image is large enough to cause tiling? I'd have to say, embarrassingly, that when I converted all of our effects from the old method of setting the ROI (using setROISelector) to applyWithExtent, I found examples somewhere that used destRect and followed along, clearly not fully appreciating what impact it could have. It would appear that in some cases this isn't right at all, and we want either the entire source (CGRectInfinite, I guess) or, as with this mirroring effect, the portion of the image that is being mirrored.
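To summarize my current understanding of the options (a sketch; CIKernelROICallback is the block typedef that applyWithExtent expects, and my understanding is that Core Image intersects whatever the callback returns with the input's extent):

// 1. Identity ROI: correct only when each output pixel depends on
//    roughly the same region of the input.
CIKernelROICallback identity = ^CGRect(int index, CGRect destRect) {
    return CGRectInset(destRect, -1, -1);
};

// 2. Whole-source ROI: always correct, but defeats the point of tiling,
//    so memory and time grow with the size of the source.
CIKernelROICallback everything = ^CGRect(int index, CGRect destRect) {
    return CGRectInfinite;
};

// 3. Tailored ROI: exactly (or conservatively) what the kernel samples,
//    e.g. the mirrored rect or the half being mirrored - correct and
//    still tileable.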
