Hello,
I have developed a custom filesystem in golang that relies on macFUSE.
High-level apps on macOS (TextEdit, Numbers, Preview) rely on syscall.renamex_np with the flag RENAME_SWAP in order to save edits.
In golang, the system calls renamex_np and renameat2 are not supported, so I had to implement the logic for them myself.
The discussion opened on the google group for macFUSE can be followed here: https://groups.google.com/g/osxfuse-group/c/Kh0qVRGIVv4
On my mounted filesystem, edits work and system calls succeed. However, after I perform a series of edits in TextEdit and completely exit TextEdit, calling open (1) on the file produces the following error:
The application cannot be opened for an unexpected reason, error=Error Domain=NSOSStatusErrorDomain Code=-43 "fnfErr: File not found" UserInfo={_LSLine=4129, _LSFunction=_LSOpenStuffCallLocal}
From the logs of my app, there is no open (2) called on the file.
I have tried to trace the open call for Numbers/TextEdit with dtruss, but when I reproduce the above scenario my Mac freezes, and after rebooting the piped output from dtruss is 0 bytes.
How can I debug my issue? Where can I find more documentation on the order of system calls made by open (1)? I couldn't find the source code for renamex_np, so my implementation is based on the Linux kernel implementation of renameat2. Does renamex_np do something different?
I note that if I open TextEdit first and then open my file from within it, there is no problem. Calling cat on the file in the terminal also displays the content correctly. The problem seems to come from open (1). Furthermore, if I rename the file, open (1) succeeds in opening it until I perform at least one more edit from a high-level app (which calls rename with the swap flag). Also, if I unmount my filesystem and mount it again, open (1) behaves correctly.
How can I understand what open (1) is doing under the hood? For the high-level apps I could trace the system calls and figure out why they didn't work, but I have now reached a scenario where I can't trace the system calls for open (1) because my whole system freezes.
Any input is appreciated.
Thank you for pointing it out, I would have been stuck on this for a while. I think that once I figure out why the inode doesn't get removed (even when there are no open file descriptors on it, though children are still present), the problem should be solved.
You're very welcome, glad to hear you were able to get to the bottom of this.
One quick comment on this:
your file system is not required to support atomic swaps and the system will take care of it if it doesn't
When rename is called with the swap flag, returning syscall.EINVAL, syscall.ENOTSUP, or syscall.ENOSYS has the effect that edits made in TextEdit/Numbers/Preview cannot be saved. I need to support rename with swap.
The issue here is, IMHO, caused by a bug in FUSE. I believe what's going on here is that they don't provide any way for you to set "VOL_CAP_INT_EXCHANGEDATA" and/or "VOL_CAP_INT_RENAME_SWAP" to "false". This is what they SHOULD be doing in your case (and probably MANY other file systems). That's also why the failure above happens: your file system said "I support rename swap" but is then failing every call to "RENAME_SWAP". We never try anything else because the file system told us that it would work.
What SHOULD have happened here is that FUSE should have returned "false" for VOL_CAP_INT_EXCHANGEDATA/VOL_CAP_INT_RENAME_SWAP, at which point the system would never have tried RENAME_SWAP.
I would strongly encourage you to follow up with the FUSE team on this. Many file systems don't support atomic exchanges, and claiming support without a valid implementation risks data loss.
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware