Issue with Using Pre-Allocated CVPixelBuffer for CoreML Model Prediction

Hello everyone,

I have a PyTorch model that outputs an image. I converted this model to CoreML using coremltools, and the resulting CoreML model runs in my iOS project through MLModel's prediction function; the output image feature is returned as a CVPixelBuffer.

I want to avoid allocating memory on every call to the prediction function and instead reuse a pre-allocated buffer. I noticed that MLModel provides an overload of prediction that accepts an MLPredictionOptions object. Its outputBackings property lets me supply a pre-allocated CVPixelBuffer for an output feature.
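In other words, I expect something along these lines to work (a minimal sketch of the pattern; preallocatedPixelBuffer and input are placeholder names, and I am assuming a single output feature named "out"):

let options = MLPredictionOptions()
// Ask Core ML to write the "out" feature into my pre-allocated buffer
// instead of allocating a new CVPixelBuffer on every call.
options.outputBackings = ["out": preallocatedPixelBuffer]
let result = try model.prediction(from: input, options: options)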

However, when I attempt to do this, I encounter the following error:

Copy from tensor to pixel buffer (pixel_format_type: BGRA, image_pixel_type: BGR8, component_dtype: INT, component_pack: FMT_32) is not supported.

Could someone point out what I might be doing wrong? How can I make MLModel write into my pre-allocated CVPixelBuffer instead of creating a new one on each call?

Here is the Python code I used to convert the PyTorch model to CoreML, where I specified color_layout as ct.colorlayout.BGR for the output:

import coremltools as ct
import torch


def export_ml(model, resolution="640x360"):
    ml_path = "model.mlpackage"

    print("exporting ml model")

    width, height = map(int, resolution.split('x'))
    img0 = torch.randn(1, 3, height, width)
    img1 = torch.randn(1, 3, height, width)

    model.eval()
    traced_model = torch.jit.trace(model, (img0, img1))

    input_shape = ct.Shape(shape=(1, 3, height, width))
    # Input image types; the names match the generated RifeInput class (img0 / img1).
    input_type_img0 = ct.ImageType(name="img0", shape=input_shape, color_layout=ct.colorlayout.BGR)
    input_type_img1 = ct.ImageType(name="img1", shape=input_shape, color_layout=ct.colorlayout.BGR)
    output_type_img = ct.ImageType(name="out", scale=1.0, bias=[0, 0, 0], color_layout=ct.colorlayout.BGR)

    ml_model = ct.convert(
        traced_model,
        inputs=[input_type_img0, input_type_img1],
        outputs=[output_type_img]
    )

    ml_model.save(ml_path)

Here is the Swift code in my iOS project that calls the MLModel's prediction function:

func prediction(image1: CVPixelBuffer, image2: CVPixelBuffer, model: MLModel) -> CVPixelBuffer? {
    let options = MLPredictionOptions()

    // outputBacking is a stored property holding the pre-allocated CVPixelBuffer
    // (see the creation code further below).
    guard let outputBuffer = outputBacking else {
        fatalError("Failed to create CVPixelBuffer.")
    }
    options.outputBackings = ["out": outputBuffer]
    
    // Perform the prediction
    guard let prediction = try? model.prediction(from: RifeInput(img0: image1, img1: image2), options: options) else {
        Log.i("Failed to perform prediction")
        return nil
    }

    // Extract the result
    guard let cvPixelBuffer = prediction.featureValue(for: "out")?.imageBufferValue else {
        Log.i("Failed to get results from the model")
        return nil
    }

    return cvPixelBuffer
}

Here is the code I used to create the outputBacking:

var outputBacking: CVPixelBuffer?

let attributes: [String: Any] = [
    kCVPixelBufferCGImageCompatibilityKey as String: true,
    kCVPixelBufferCGBitmapContextCompatibilityKey as String: true,
    kCVPixelBufferWidthKey as String: Int(640),
    kCVPixelBufferHeightKey as String: Int(360),
    kCVPixelBufferIOSurfacePropertiesKey as String: [:] as [String: Any]
]

// Create a 640x360 BGRA, IOSurface-backed pixel buffer to use as the output backing.
let status = CVPixelBufferCreate(kCFAllocatorDefault, 640, 360, kCVPixelFormatType_32BGRA, attributes as CFDictionary, &outputBacking)

guard status == kCVReturnSuccess, let outputBuffer = outputBacking else {
    fatalError("Failed to create CVPixelBuffer.")
}

Any help or guidance would be greatly appreciated!

Thank you!

Was my description not clear enough?

Let me add some details:

I extracted the model output CVPixelBuffer from the result returned by the MLModel.prediction() function and passed it as a cached outputBuffer to the outputBackings of MLPredictionOptions, like this: options.outputBackings = ["out": outputBuffer].
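Roughly, the reuse attempt looks like this (a minimal sketch of what I described; cachedOutput and the function name are placeholders for my actual code, and "out" is the model's output feature name):

var cachedOutput: CVPixelBuffer?

func predictReusingPreviousOutput(input: MLFeatureProvider, model: MLModel) throws -> CVPixelBuffer? {
    let options = MLPredictionOptions()
    if let buffer = cachedOutput {
        // Reuse the pixel buffer produced by the previous prediction as the output backing.
        options.outputBackings = ["out": buffer]
    }
    let result = try model.prediction(from: input, options: options)
    let pixelBuffer = result.featureValue(for: "out")?.imageBufferValue
    cachedOutput = pixelBuffer   // cache for the next call
    return pixelBuffer
}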

However, I still encountered the error: "Copy from tensor to pixel buffer (pixel_format_type: BGRA, image_pixel_type: BGR8, component_dtype: INT, component_pack: FMT_32) is not supported."
