CoreML - doUnloadModel:options:qos:error

I have a model that uses a CoreML delegate, and I’m getting the following warning whenever I set the model to nil. My understanding is that CoreML is creating a cache in the app’s storage but is having issues clearing it. As a result, the app’s storage usage increases every time the model is loaded.

This StackOverflow post explains the problem in detail: App Storage Size Increases with CoreML usage

This is a critical issue because the cache will eventually fill up the phone’s storage:

doUnloadModel:options:qos:error:: model=_ANEModel: { modelURL=file:///var/mobile/Containers/Data/Application/22DDB13E-DABA-4195-846F-F884135F37FE/tmp/F38A9824-3944-420C-BD32-78CE598BE22D-10125-00000586EFDFD7D6.mlmodelc/ : sourceURL= (null) : key={"isegment":0,"inputs":{"0_0":{"shape":[256,256,1,3,1]}},"outputs":{"142_0":{"shape":[16,16,1,222,1]},"138_0":{"shape":[16,16,1,111,1]}}} : identifierSource=0 : cacheURLIdentifier=E0CD0F44FB0417936057FC6375770CFDCCC8C698592ED412DDC9C81E96256872_C9D6E5E73302943871DC2C610588FEBFCB1B1D730C63CA5CED15D2CD5A0AC0DA : string_id=0x00000000 : program=_ANEProgramForEvaluation: { programHandle=6077141501305 : intermediateBufferHandle=6077142786285 : queueDepth=127 } : state=3 : programHandle=6077141501305 : intermediateBufferHandle=6077142786285 : queueDepth=127 : attr={
    ANEFModelDescription =     {
        ANEFModelInput16KAlignmentArray =         (
        );
        ANEFModelOutput16KAlignmentArray =         (
        );
        ANEFModelProcedures =         (
                        {
                ANEFModelInputSymbolIndexArray =                 (
                    0
                );
                ANEFModelOutputSymbolIndexArray =                 (
                    0,
                    1
                );
                ANEFModelProcedureID = 0;
            }
        );
        kANEFModelInputSymbolsArrayKey =         (
            "0_0"
        );
        kANEFModelOutputSymbolsArrayKey =         (
            "138_0@output",
            "142_0@output"
        );
        kANEFModelProcedureNameToIDMapKey =         {
            net = 0;
        };
    };
    NetworkStatusList =     (
                {
            LiveInputList =             (
                                {
                    BatchStride = 393216;
                    Batches = 1;
                    Channels = 3;
                    Depth = 1;
                    DepthStride = 393216;
                    Height = 256;
                    Interleave = 1;
                    Name = "0_0";
                    PlaneCount = 3;
                    PlaneStride = 131072;
                    RowStride = 512;
                    Symbol = "0_0";
                    Type = Float16;
                    Width = 256;
                }
            );
            LiveOutputList =             (
                                {
                    BatchStride = 113664;
                    Batches = 1;
                    Channels = 111;
                    Depth = 1;
                    DepthStride = 113664;
                    Height = 16;
                    Interleave = 1;
                    Name = "138_0@output";
                    PlaneCount = 111;
                    PlaneStride = 1024;
                    RowStride = 64;
                    Symbol = "138_0@output";
                    Type = Float16;
                    Width = 16;
                },
                                {
                    BatchStride = 227328;
                    Batches = 1;
                    Channels = 222;
                    Depth = 1;
                    DepthStride = 227328;
                    Height = 16;
                    Interleave = 1;
                    Name = "142_0@output";
                    PlaneCount = 222;
                    PlaneStride = 1024;
                    RowStride = 64;
                    Symbol = "142_0@output";
                    Type = Float16;
                    Width = 16;
                }
            );
            Name = net;
        }
    );
} : perfStatsMask=0}  was not loaded by the client.

CoreML keeps artifacts of the model specialization in a purgeable cache. (See https://developer.apple.com/videos/play/wwdc2023/10049/?time=540)

It reuses the cached model when it loads the same model. So, unless your application loads a new, different model, the cache size shouldn't increase.

Also, the operating system purges the cached objects when storage space is running short.

Is this consistent with your observation?

We’re reloading the same model, but CoreML is creating a new cache entry each time the model is loaded. We noticed this issue because one of our customers could no longer use the app after their phone’s storage filled up. This is a high-priority issue for us, so thank you for your quick response. We hope you can help us find a solution; even a way to manually clear the cache would be very helpful.

I watched the WWDC 2023 video and understand that Core ML is supposed to create a cache and reuse it, but that’s not happening in this case. The doUnloadModel:options:qos:error warning seems to be a clear indicator of this issue.

This post explains the issue in more detail: App Storage Size Increases with CoreML or Metal Usage on iOS

That -doUnloadModel: message has nothing to do with model cache management.

Let me start with a common gotcha of the on-device model compilation APIs (MLModel.compileModel(at:), etc.). If you use these APIs, you should move the compiled model (the .mlmodelc bundle) to a non-purgeable location such as .applicationSupportDirectory and then delete it after use. Even if your app doesn't use the on-device compilation APIs, some third-party frameworks might, so please double check.
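To make that concrete, here is a minimal, hedged sketch of the "move it out of tmp" step. The function name and structure are ours, not an official CoreML API; in a real app the source URL would come from MLModel.compileModel(at:), which writes the .mlmodelc into a temporary (purgeable) location. The move logic itself is plain FileManager, so the sketch stands alone:

```swift
import Foundation

/// Moves a compiled model bundle (e.g. the .mlmodelc URL returned by
/// MLModel.compileModel(at:)) out of its temporary location into a
/// permanent, caller-supplied directory, replacing any stale copy.
/// Illustrative helper, not an official CoreML API.
func persistCompiledModel(at compiledURL: URL, into directory: URL) throws -> URL {
    let fm = FileManager.default
    try fm.createDirectory(at: directory, withIntermediateDirectories: true)
    let destination = directory.appendingPathComponent(compiledURL.lastPathComponent)
    // Replace a previously persisted copy if one exists.
    if fm.fileExists(atPath: destination.path) {
        try fm.removeItem(at: destination)
    }
    try fm.moveItem(at: compiledURL, to: destination)
    return destination
}
```

In an app, `directory` would typically be the first URL from FileManager.default.urls(for: .applicationSupportDirectory, in: .userDomainMask), and the persisted bundle should be deleted once the model is no longer needed.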

Now, let's talk about the model specialization and its cache management.

We generate a different specialization for each combination of model, compute unit selection, and other MLModelConfiguration properties. So, for a given .mlmodelc bundle, if you load it with .cpuOnly and then with .cpuAndGPU, that creates two specializations in the cache. It seems to me that this is what is happening in the repro video on StackOverflow.

Also, we compare various file metadata, rather than the actual model contents, to decide whether a model is the same as the cached one. This means that during development, CoreML takes the uncached load path whenever Xcode installs the app on the device. Similarly, if you use the on-device compilation APIs, the resulting model is considered a different model on every compilation.
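To illustrate why metadata-based identity defeats caching for recompiled models, here is a hedged, Foundation-only sketch. This is not CoreML's actual cache-key algorithm (that derivation is private) and the function name is ours; it only shows that a key built from path, size, and modification date changes on every recompilation even when the bytes are identical:

```swift
import Foundation

// Illustrative only: derives an identity string from file metadata
// (path, size, modification date) instead of the file's contents.
// Rewriting the file with identical bytes still yields a new identity.
func metadataIdentity(of url: URL) throws -> String {
    let attrs = try FileManager.default.attributesOfItem(atPath: url.path)
    let size = (attrs[.size] as? NSNumber)?.uint64Value ?? 0
    let modified = (attrs[.modificationDate] as? Date)?.timeIntervalSince1970 ?? 0
    return "\(url.path):\(size):\(modified)"
}
```

Under this scheme, each on-device compilation produces a bundle with a fresh modification date, so the loader sees a "new" model and takes the uncached path.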

For further analysis, we need a repro case that uses the CoreML API directly. (We cannot debug the TensorFlow CoreML integration.)

I don't know enough to fix TFLite's CoreML delegate implementation, but thanks for all the suggestions. My guess is that there is something we need to update in coreml_executer.mm.
