/System/Library/Frameworks dylibs are ... not quite there

While playing with this app I found something odd:

import Foundation

// dlopen takes RTLD_* mode flags, not open(2) flags such as O_RDONLY.
let dylib1 = dlopen("/System/Library/Frameworks/CreateMLComponents.framework/CreateMLComponents", RTLD_LAZY)!
let s1 = dlsym(dylib1, "CreateMLComponentsVersionString")!
var info1 = Dl_info()
let success1 = dladdr(s1, &info1)
precondition(success1 != 0)
print(String(cString: info1.dli_sname!)) // CreateMLComponentsVersionString
let path1 = String(cString: info1.dli_fname!)
print(path1) // /System/Library/Frameworks/CreateMLComponents.framework/Versions/A/CreateMLComponents
let exists1 = FileManager.default.fileExists(atPath: path1)
print(exists1) // true

let dylib2 = dlopen("/System/Library/Frameworks/Foundation.framework/Foundation", RTLD_LAZY)!
let s2 = dlsym(dylib2, "NSAllocateMemoryPages")!
var info2 = Dl_info()
let success2 = dladdr(s2, &info2)
precondition(success2 != 0)
print(String(cString: info2.dli_sname!)) // NSAllocateMemoryPages
let path2 = String(cString: info2.dli_fname!)
print(path2) // /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation
let exists2 = FileManager.default.fileExists(atPath: path2)
print(exists2) // false

The app runs fine and prints true for exists1 and false for exists2. So both dlsym calls succeed and both dladdr calls return paths (inside CreateMLComponents.framework and Foundation.framework respectively), yet the first file exists while the second file doesn't exist.


This raises quite a few questions:

  1. Why do some of the dylib files (in fact, most dylibs inside the /System/Library/Frameworks hierarchy) not exist at the expected locations?

  2. Why do we have symbolic links (like Foundation.framework/Foundation) that point to those non-existent locations? What is the purpose of those symbolic links?

  3. Where are those missing dylib files in fact? They must be somewhere, no?! I guess to figure out the answer I could search the whole disk raw bytes for a particular byte pattern to know the answer but hope there's an easier way to know the truth!

  4. Why do we have some exceptional cases like "CreateMLComponents.framework" and a couple of others that don't follow the rules established by the rest?

Thanks!

Answered by DTS Engineer in 798009022
the first file exists while the second file doesn't exist.

Yes. That’s expected, and it’s fallout from the dynamic linker shared cache. I explain that in this post. And, it’ll probably be worth your while reading An Apple Library Primer for general backstory.

Taking a step back, why are you trying to dlopen system libraries? There are some cases where that makes sense, but in most situations it’s better to import those libraries rather than dynamically load them.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Where are those missing dylib files in fact? They must be somewhere, no?! I guess to figure out the answer I could search the whole disk raw bytes for a particular byte pattern to know the answer but hope there's an easier way to know the truth!

Adding one point that may clarify this: scattered through Quinn's post you'll find references to the "dynamic linker shared cache". The problem with that term is that the evolution of the system has made the word "cache" quite misleading. It USED to be a cache, in that the system took all of the dynamic libraries it had and created a cache file it could then use to speed things up.

That's not what it is today.

It isn't a "cache" of other libraries, it IS the system's library "set". The system assumes that file will exist and will contain what it should. If it doesn't, then the system is simply broken and (probably) won't boot.

Note that this was driven by security and boot architecture changes as much as performance. The library cache needs to be in the read-only system partition so that it's available at boot and as an additional security check. It could be built locally for every system update, but that just creates extra work, wastes time, creates another failure point, and makes it harder to verify and sign everything. Much easier to just download it as part of the update... at which point there isn't any reason to ship the "original" libraries, as they would just waste space.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Thanks, this sheds some light on (1), although it doesn't answer (2), (3) and (4) in my original post.

I'd be glad to know more about that library cache / system partition thing. Could I see that partition? Are there the whole "***.framework" folders themselves in there or just the dylib files that were stripped from the "original" location? As this is not traditional unix anymore why do we still have "dli_fname" with something that resembles a valid path, instead of, say, an empty string? Given exceptions in (4) normal library locations are still supported so it does look like an overly complicated setup: for starters you need to support both ways instead of just one.

Taking a step back, why are you trying to dlopen system libraries?

That's just a test setup that reproduces the difference in behaviour in this particular test. Basically I need the library file name, file size and its modification date, ideally the checksum but can do with the first three, so dlsym (with one of the magic constants for the file specification) + dladdr + stat or similar API should be enough, and dlopen is not needed.

Could I see that partition?

It is not a partition, but a file. Imagine if you took a whole bunch of dynamic libraries and statically linked them together. That’s kinda the way to think about the dynamic linker shared cache.

Consider this:

% cat test.c 
#include <stdio.h>

extern int main(int argc, char ** argv) {
    printf("Hello Cruel World!\n");
    return 0;
}
% clang -o test test.c
% lldb test
(lldb) target create "test"
Current executable set to '/Users/quinn/Test/test' (arm64).
(lldb) br set -n printf
Breakpoint 1: where = libsystem_c.dylib`printf, address = 0x00000001803092b4
(lldb) r
Process 48202 launched: '/Users/quinn/Test/test' (arm64)
…
(lldb) shell lsof -p 48202
COMMAND   PID  USER   FD   TYPE DEVICE   SIZE/OFF     NODE NAME
…
test    48202 quinn  txt    REG   1,14 2449473536   187905 …/dyld_shared_cache_arm64e
…

As you can see, my test tool has memory mapped a giant file called dyld_shared_cache_arm64e.

WARNING The location and format of the dynamic linker shared cache is not documented. These things change regularly, typically with every major release of macOS. Don’t build products that rely on this stuff.

Are there the whole [.framework] folders themselves in there or just the dylib files that were stripped from the "original" location?

The latter. The rest of the framework infrastructure, like its resources, continues to live in the standard place.

As this is not traditional unix anymore why do we still have dli_fname with something that resembles a valid path, instead of, say, an empty string?

Because the latter would cause compatibility problems. It’s not uncommon for programs to call dladdr, look up dli_fname, and start poking around on the file system for, say, the bundle ID of a framework.

it does look like an overly complicated setup

Welcome to the world of building an operating system used by hundreds of millions of users and a commensurate number of developers.

I need the library file name, file size and its modification date

What are you planning to do with that info?

Also, if you’re starting with dlsym, that means you’re doing this in process. If so, dli_fbase is your friend. From that you can find the LC_UUID value, which is by far the best way to uniquely identify a Mach-O image.
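
A minimal sketch of the dli_fbase suggestion, in C. It assumes a 64-bit Mach-O image that is already mapped into the current process, so the load commands sit directly after the header that dli_fbase points at. The function name image_uuid is hypothetical, and the #else branch holds stand-in definitions that mirror <mach-o/loader.h> only so the sketch compiles off macOS; on macOS the real header is used.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef __APPLE__
#include <mach-o/loader.h>   /* real Mach-O definitions on macOS */
#else
/* Assumption: minimal stand-ins mirroring <mach-o/loader.h>, present only
   so this sketch compiles on other platforms. */
#define MH_MAGIC_64 0xfeedfacfu
#define LC_UUID     0x1bu
struct mach_header_64 {
    uint32_t magic;
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;
    uint32_t sizeofcmds;
    uint32_t flags;
    uint32_t reserved;
};
struct load_command { uint32_t cmd; uint32_t cmdsize; };
struct uuid_command { uint32_t cmd; uint32_t cmdsize; uint8_t uuid[16]; };
#endif

/* Hypothetical helper: copies the LC_UUID of the 64-bit Mach-O image whose
   header starts at `base` (for example, Dl_info.dli_fbase) into `out`.
   Returns false if `base` is not a 64-bit Mach-O header, if a load command
   is malformed, or if the image has no LC_UUID. */
bool image_uuid(const void *base, uint8_t out[16]) {
    const struct mach_header_64 *header = base;
    if (header == NULL || header->magic != MH_MAGIC_64) {
        return false;
    }
    /* Load commands start immediately after the header. */
    const uint8_t *cursor = (const uint8_t *)(header + 1);
    const uint8_t *end = cursor + header->sizeofcmds;
    for (uint32_t i = 0; i < header->ncmds; i++) {
        struct load_command lc;
        if ((size_t)(end - cursor) < sizeof lc) {
            return false;
        }
        memcpy(&lc, cursor, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > (size_t)(end - cursor)) {
            return false;
        }
        if (lc.cmd == LC_UUID && lc.cmdsize >= sizeof(struct uuid_command)) {
            /* The 16 UUID bytes follow the cmd/cmdsize pair. */
            memcpy(out, cursor + sizeof lc, 16);
            return true;
        }
        cursor += lc.cmdsize;
    }
    return false;
}
```

On macOS, call dladdr as in the original post and pass info.dli_fbase as base. Because the UUID is read from the mapped image rather than from disk, this works even when dli_fname names a file that does not exist.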

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
