Any specific event that can be monitored for tracking file upload via fileproviderd/extension?

Hi Team, I am trying to explore the ESF events generated specifically by cloud sync extensions built on the File Provider framework.

Brief: I have a high-level understanding of how various cloud vendors provide extensions to sync data from cloud/remote storage to the local filesystem (and vice versa), e.g. iCloudDriveFileProvider (iCloud) and DFSFileProviderExtension (Google Drive).

I can see two ESF AUTH events for File Provider, namely ES_EVENT_TYPE_AUTH_FILE_PROVIDER_MATERIALIZE and ES_EVENT_TYPE_AUTH_FILE_PROVIDER_UPDATE, along with their respective NOTIFY events.

Observation: These events are generally triggered by the fileproviderd process in the download scenario, i.e. when syncing files from cloud/remote storage to the local filesystem: 'materialize' for new file creation and 'update' for updating an existing file.

Question/Problem:

  • Is there a correct way to find which cloud provider has triggered this download event, i.e. whether it is iCloudDriveFileProvider or DFSFileProviderExtension? (There is an instigator field in the Materialize event struct, but I could not find a similar one for the Update event.)

  • Are there similar ESF events for the upload scenario? (I have a fair understanding of how the file to upload is copied to a temp location and then uploaded to remote storage by the respective extension. But the extension works with a clone of the original file in its temp location, so the AUTH events generated by the extension won't reveal the original file name, even if I am able to get the file provider name.)

To summarize: I am looking for an ESF event that is triggered during the upload scenario and that also tells me the original file name as well as the cloud provider extension's process name. As of now, the 'fileproviderd' process name is obtained from filesystem ESF events like AUTH_OPEN.

Answered by DTS Engineer in 805191022

Is there a correct way to find which cloud provider has triggered this download event?

You could probably infer it from the path itself, but no, there isn't any way to do it from the ES event itself. The issue here is that both of these events occur before anything actually "happens"*, which means there isn't any guarantee that the target extension is even running.
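As an illustration of the "infer it from the path" idea, here is a minimal sketch. It assumes the layout commonly seen on recent macOS releases, where File Provider domains appear under ~/Library/CloudStorage/<Provider>-<account>/ and iCloud Drive under ~/Library/Mobile Documents/. That layout is an assumption, not something the ES event guarantees, and the function name provider_from_path is hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional


def provider_from_path(path: str) -> Optional[str]:
    """Best-effort guess of the sync provider from a file path.

    ASSUMPTION: providers live under ~/Library/CloudStorage/<Provider>-<account>/
    and iCloud Drive under ~/Library/Mobile Documents/. Nothing requires a
    provider to keep using these locations.
    """
    parts = PurePosixPath(path).parts
    for i in range(len(parts) - 2):
        if parts[i] != "Library":
            continue
        if parts[i + 1] == "CloudStorage":
            # "GoogleDrive-alice@example.com" -> "GoogleDrive"
            return parts[i + 2].split("-", 1)[0]
        if parts[i + 1] == "Mobile Documents":
            return "iCloud Drive"
    return None
```

Treat the result as a hint only: a None result does not prove the file is unmanaged, and a match does not prove the named extension is installed or running.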

Related to that point:

i.e. whether it is iCloudDriveFileProvider or DFSFileProviderExtension? (There is an instigator field in the Materialize event struct, but I could not find a similar one for the Update event.)

I haven't directly tested it, but I'd assume that "instigator" would be the underlying process that actually asked for the file to materialize. In concrete terms, if you asked TextEdit to open a dataless text file, then (I believe) TextEdit would be the "instigator".

I believe this is also why "update" doesn't have an instigator field. The update operation would have been tied to the internal operation of the file provider itself, not directly triggered by a different app.
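A small model of that asymmetry may help when writing a handler that treats both events uniformly. The classes below are hypothetical, simplified stand-ins for the two ES event payloads, not the real C structs; the only detail taken from the thread is that materialize carries an instigator and update does not.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class MaterializeEvent:
    # Hypothetical stand-in: executable path of the process that asked for the file.
    instigator_path: str
    target_path: str


@dataclass
class UpdateEvent:
    # Hypothetical stand-in: no instigator, matching the observation in the thread.
    target_path: str


def requesting_process(event: Union[MaterializeEvent, UpdateEvent]) -> Optional[str]:
    """Return the instigator's executable path, or None when the event has none."""
    return getattr(event, "instigator_path", None)
```

If the assumption above holds, a materialize event triggered by TextEdit would report TextEdit here, while an update event always yields None, so any attribution for updates has to come from somewhere else, such as the path.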

Are there similar ESF events for upload scenario?

No, because there isn't any single or clear "upload scenario". The system doesn't have any real visibility into the file provider extension's architecture, so it has no idea what actually happens to the files the file provider "owns". At a minimum, network issues ("being offline") mean that there will often be cases where there are significant delays between when the extension receives a change and when it's actually uploaded. However, that's the simplest case. Other possibilities include:

  • The file may intentionally never be uploaded by "policy". Think of something like a version control system where specific files/file types are marked as "local only" and never actually sent to the remote server.

  • The file may intentionally never be uploaded because of other changes. If the provider determines that a file has been deleted at the server, then there's no reason to upload it.

  • There may not be a "network" at all. The standard "cloud storage" product is obviously the common case, but I could easily see a file provider extension being used to bridge into non-standard use cases that don't involve any kind of network.

Related to that point, I would be very cautious about this:

Brief: I have a high-level understanding of how various cloud vendors provide extensions to sync data from cloud/remote storage to the local filesystem (and vice versa), e.g. iCloudDriveFileProvider (iCloud) and DFSFileProviderExtension (Google Drive).

It's very easy to test with the "common" cases, assume that this is in fact "How Things Really Work™", then build a complicated architecture that relies on those assumptions. That works great until the original product(s) change (because nothing actually required them to work that way) or you encounter a new product that simply doesn't work that way at all.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Thank you, Kevin, for the response. I appreciate it.

It would have been a nice-to-have if fileproviderd could reveal which cloud extension it is working for while serving the request. That would help in use cases where we are interested in monitoring a specific extension only. We can deduce this information from the file path, but as you rightly mentioned, we cannot fully rely on such assumptions.

It would have been a nice-to-have if fileproviderd could reveal which cloud extension it is working for while serving the request. That would help in use cases where we are interested in monitoring a specific extension only. We can deduce this information from the file path, but as you rightly mentioned, we cannot fully rely on such assumptions.

As a quick followup here, I think the real reason (at least for materialize) it doesn't give you this information is that fileproviderd doesn't ACTUALLY "know" the answer yet. More specifically, the architecture here relies on "dataless files", as described in TN3150.

When a process opens a dataless file, the sequence is basically:

  1. The kernel notices the file is dataless, blocks the calling thread in "open" and tells fileproviderd it needs the file.
  2. fileproviderd figures out who "owns" that particular file and notifies that client.
  3. That client downloads the content and tells fileproviderd about the new content.
  4. fileproviderd "sets everything up" and then tells the kernel about it.
  5. The kernel completes the open using the new content.

The ES check happens between steps 1 and 2, that is, "before" fileproviderd does any real work with the file.
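For experimenting with this sequence, it can be useful to check whether a file is still dataless before and after something opens it. The sketch below tests the SF_DATALESS bit in st_flags, which is the flag TN3150 describes; the constant is copied from <sys/stat.h> on macOS, the helper names are hypothetical, and on platforms without st_flags the check simply reports False. As I understand TN3150, reading metadata this way should not itself trigger materialization, but treat that as an assumption to verify.

```python
import os

# Value of SF_DATALESS from <sys/stat.h> on macOS (defined here so the
# sketch does not depend on a recent Python exposing it in the stat module).
SF_DATALESS = 0x40000000


def flags_mark_dataless(st_flags: int) -> bool:
    """Return True if the given st_flags value has the dataless bit set."""
    return bool(st_flags & SF_DATALESS)


def is_dataless(path: str) -> bool:
    """Check a path's flags without opening the file.

    Uses lstat, which reads metadata only; st_flags is missing on
    non-BSD platforms, in which case this returns False.
    """
    st = os.lstat(path)
    return flags_mark_dataless(getattr(st, "st_flags", 0))
```

A file that reports True before an open and False afterwards has gone through steps 1 to 5 above in between.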

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Thank you, Kevin, for your response.
