How often do linked documents change after send? Peter Kozak of Cloudficient may have an answer – with an important caveat.
In this post by Peter, titled (wait for it!) How Often Do Linked Documents Change After Send? (available here), he discusses Craig Ball’s intuition that fewer than 10 to 20% of linked attachments are meaningfully modified after being shared. Craig offered that estimate in his post titled A Dog and Its Tail: Don’t Let Version Uncertainty Cloud Linked Attachment Production (available here and covered by us here). In that same post, Craig called for meaningful stats that accurately reflect the percentage of files modified after they were sent.
As part of my coverage of the update to the Reconstruction-Grade eDiscovery standard, I recalled that Tom O’Connor and Rachi Messing had interviewed Stefanie Bier back in November 2024. Stefanie was (and still is) Principal PM Manager, Data Compliance and Privacy Products at Microsoft. At the time, she stated that Microsoft’s legal department has found that in over 35% of recent cases, the data in cloud attachments changed between the time it was shared and when it was later accessed.
Peter’s latest post details Cloudficient’s “first-run measurement…[on] one tenant, on a platform-versioning measure, reported on our own methodology and footing”, as follows:
- The preservation system accumulated send-time metadata for several million email-linked document events.
- From that population, 10,000 links were randomly selected for comparison against Microsoft Graph’s current `DriveItem` state – `lastModifiedDateTime`, the versions collection, and lifecycle indicators.
- 9,795 of the 10,000 links resolved successfully against Microsoft Graph.
- For 6,098 of the resolved rows – non-`.aspx` file-link rows where both the Graph lookup succeeded and the preservation system had captured a send-time version identifier – the team could compute the headline comparison.
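To make the comparison in the steps above concrete, here is a minimal Python sketch of how one resolved link might be classified. It is not Cloudficient's methodology, which has not yet been published. The function names, the input shapes, and the rule that "current" means the most recently modified entry in the versions collection are all assumptions for illustration. The `id` and `lastModifiedDateTime` fields mirror what Microsoft Graph returns for version entries, but no Graph call is made here.

```python
from datetime import datetime
from typing import Optional


def parse_graph_time(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as '2025-01-15T09:30:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def current_version_id(versions: list) -> Optional[str]:
    """Return the id of the most recently modified entry (assumed to be current)."""
    if not versions:
        return None
    latest = max(versions, key=lambda v: parse_graph_time(v["lastModifiedDateTime"]))
    return latest["id"]


def classify_link(file_name: str, send_time_version_id: Optional[str], versions: list) -> str:
    """Classify one link as 'excluded', 'unmeasurable', 'changed', or 'unchanged'."""
    if file_name.lower().endswith(".aspx"):
        return "excluded"  # .aspx rows are outside the headline comparison
    now_id = current_version_id(versions)
    if send_time_version_id is None or now_id is None:
        return "unmeasurable"  # no send-time identifier or no version data
    return "unchanged" if now_id == send_time_version_id else "changed"


# Illustrative records only; none of these values come from the measurement.
versions_now = [
    {"id": "3.0", "lastModifiedDateTime": "2025-03-02T14:05:00Z"},
    {"id": "2.0", "lastModifiedDateTime": "2025-01-20T08:00:00Z"},
    {"id": "1.0", "lastModifiedDateTime": "2025-01-15T09:30:00Z"},
]
print(classify_link("budget.xlsx", "1.0", versions_now))  # changed
print(classify_link("budget.xlsx", "3.0", versions_now))  # unchanged
```

Only rows that come back as "changed" or "unchanged" would count toward a headline percentage under this sketch, which is roughly how 10,000 sampled links could narrow to a smaller measurable population.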
What did they find? Within that 6,098-link population, 75.3% of files no longer carried the same Microsoft version identifier at query time that they carried at send time. By Microsoft’s own version accounting, the file the recipient would retrieve today often is not in the same version state it was in when the link arrived in their inbox.
🤯
It’s even more common when you restrict to the Microsoft Office document types that dominate modern discovery – Word (83.1%), PowerPoint (83.3%), and Excel (78.4%). Across the Office subtotal of 5,459 files, 81.9% changed.
🤯 🤯
Do linked documents change often after send? It appears that they do. However, Peter follows up with this important caveat:
“Important measurement note. In this first run, ‘modified’ means the Microsoft version identifier at query time differed from the version identifier preserved at send time. That is a platform-versioning measure. It is not yet a hash-level or semantic comparison of the file contents, and it may therefore include version increments associated with autosave, co-authoring, or other platform activity that would not necessarily amount to a reader-visible substantive edit.”
In other words: There are a lot of reasons a version of a file may have changed that have nothing to do with a notable change to the content.
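The gap between a version-identifier check and a hash-level check can be sketched in a few lines of Python. This is an illustration of the distinction Peter describes, not the refinement Cloudficient plans to run. The function names and result labels are hypothetical, and the sketch assumes you already hold the bytes of the file as sent and as it exists now.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def compare_link(send_version_id: str, current_version_id: str,
                 send_bytes: bytes, current_bytes: bytes) -> str:
    """Separate a platform version change from a byte-level content change."""
    if send_version_id == current_version_id:
        return "same version"
    if sha256_hex(send_bytes) == sha256_hex(current_bytes):
        # The platform recorded a new version, but the stored bytes match.
        return "version changed, bytes identical"
    return "version changed, bytes differ"


# Illustrative values only.
print(compare_link("1.0", "4.0", b"Q3 forecast", b"Q3 forecast"))
print(compare_link("1.0", "4.0", b"Q3 forecast", b"Q3 forecast (revised)"))
```

Even a byte-level difference would not settle the question by itself. Office files carry embedded metadata, so two saves can plausibly differ in bytes without any reader-visible edit, which is why the quoted note distinguishes hash-level from semantic comparison.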
I personally see that first-hand. I have a habit of regularly opening existing files and doing a “Save As” with a new name as a starting point when I write many of my blog posts. That’s especially true if I am writing a follow-up post on a topic I’ve written about before and want to reference some of the previous content in the new post (like I did in this post).
One thing I’ve noticed is that unless I immediately do the “Save As”, the last update date of the original file gets updated to the current date and time – simply by opening the file. And when you check the version history in Word, the file shows a version history entry to that effect.
Peter goes on to provide other statistics from the test – for example, over half (53.6%) of the files were updated within one month, 70.3% within 3 months. How many version changes did they have? The median was 17 over the population. 11.8% of the files had 100+ version changes!
🤯 🤯 🤯
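For anyone who wants to produce comparable figures from their own data, here is a rough Python sketch of how statistics like these could be computed. The row layout, the field names, and the choice to measure from send time to first change are assumptions, since Peter's post does not yet spell out the exact definitions. The four rows are toy data and do not come from the measurement.

```python
from datetime import datetime, timedelta
from statistics import median


def summarize(rows: list) -> dict:
    """Summarize time-to-change and version-count statistics for a set of links."""
    total = len(rows)

    def share_changed_within(days: int) -> float:
        limit = timedelta(days=days)
        hits = sum(1 for r in rows if r["first_change"] - r["sent"] <= limit)
        return 100 * hits / total

    counts = [r["version_changes"] for r in rows]
    return {
        "pct_within_30_days": share_changed_within(30),
        "pct_within_90_days": share_changed_within(90),
        "median_version_changes": median(counts),
        "pct_with_100_plus": 100 * sum(1 for c in counts if c >= 100) / total,
    }


# Toy rows for illustration only.
sent = datetime(2025, 1, 1)
rows = [
    {"sent": sent, "first_change": datetime(2025, 1, 10), "version_changes": 3},
    {"sent": sent, "first_change": datetime(2025, 3, 1), "version_changes": 17},
    {"sent": sent, "first_change": datetime(2025, 6, 1), "version_changes": 120},
    {"sent": sent, "first_change": datetime(2025, 1, 20), "version_changes": 40},
]
print(summarize(rows))
```

The denominators matter here: a percentage taken over all measurable links will differ from one taken over changed links only, so any published figure should say which population it uses.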
So, what are they doing next? Peter says: “We expect to publish the full measurement specification alongside subsequent measurements – the hash-level refinement, additional tenants, and retention aging – so that others can measure too. We’d rather the number be tested than trusted.” He also notes that the standards consequence of this measurement is in a companion post published alongside this one on the RGR standard site.
Do linked documents change often after send? In theory, yes. But whether the “change” represents a material content-related change remains to be seen.
So, what do you think? How often do linked documents change after send? If you have some information, please share it! And please share any comments you might have or if you’d like to know more about a particular topic.
Image created using DALL-E 3, using the term “robot dog looking at its tail quizzically” (with a staircase added via ChatGPT last time, and the background brightened via ChatGPT this time). It’s “versioning”, get it? 🤣
Disclosure: Cloudficient is an Educational Partner and sponsor of eDiscovery Today.
Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by my employer, my partners or my clients. eDiscovery Today is made available solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Today should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.
Discover more from eDiscovery Today by Doug Austin