In November, records were unintentionally deleted from our production environment. This required us to restore a backup copy of production into a new environment and retrieve the missing records from it.
Why didn’t we restore the environment backup directly over the production environment? There are many reasons which I’ve documented in the Forward Forever team blog. In short, if you’ve got any Power Apps canvas apps or Power Automate cloud flows in your environment, things can get seriously messed up if you restore the backup into the same environment. My recommendation is to avoid doing this in production if you have any workarounds at your disposal.
After we had manually copied & imported the data back, we left that restore environment in place for a while. In this case, “a while” actually meant 6 months. We were in no rush to free up the capacity, so I decided to wait and see if there were any further lessons to be learned from this incident.
What happens to the storage space of a restored environment that no one is using? You might expect it to remain roughly the same size as the original backup. In our case, the restore environment grew to over 2x the size of the original environment. Below is an illustration of the restore vs. production environment storage usage from Power Platform Admin Center reports:
Our production today is at around 7 GB total Dataverse storage consumed, whereas the production restore environment had ballooned to 17 GB. What was consuming all that space? The AsyncOperation table:
This is where all the Dataverse system jobs are stored. These jobs will keep running, even if no live users (nor outside integrations) touch the environment.
Looking at the number of rows in that table (via the XrmToolBox plugin Fast Record Counter), I saw that while our production environment had 8.4k rows, the restore environment had 51k rows.
Why are there more jobs in the dormant environment? Normally, completed system jobs are cleaned up by other scheduled jobs, known as bulk delete jobs. In this restore environment, however, the jobs just kept piling up. I checked that the bulk delete jobs weren’t reporting any errors. It was the system jobs themselves that offered the explanation for the storage space growth:
Switching to the suspended system jobs view revealed that there were 3.5k system events stuck. New batches seemed to be generated on a daily basis. With titles like “Microsoft.Dynamics.CDD.AuthorizationCorePlugins.RoleAutoExpanderPlugin”, it wasn’t immediately obvious what these jobs were related to.
Upon inspecting the system job records, the column “message name” revealed that these are related to solution imports and updates. Yes, just because you stop using a Dataverse environment, that doesn’t mean Microsoft will stop servicing it with the latest solution versions and new features.
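If you would rather not click through the system jobs view, the same check can be roughly sketched against the Dataverse Web API. The snippet below only builds the query URL and tallies rows that you have already fetched; authentication and paging are left out. The organization URL, the `v9.2` API version, the `$top` value and the sample message names are all placeholders, and the assumption that `statecode` 1 means "Suspended" on the `asyncoperation` table should be verified against your own environment.

```python
from collections import Counter
from urllib.parse import quote

# Assumption: on the asyncoperation table, statecode 1 means "Suspended".
SUSPENDED_STATE = 1


def build_suspended_jobs_url(org_url: str, top: int = 5000) -> str:
    """Build a Web API query URL for suspended system jobs.

    org_url is a placeholder such as https://example.crm.dynamics.com
    """
    select = "name,messagename,createdon"
    filter_expr = quote(f"statecode eq {SUSPENDED_STATE}")
    return (
        f"{org_url.rstrip('/')}/api/data/v9.2/asyncoperations"
        f"?$select={select}&$filter={filter_expr}&$top={top}"
    )


def tally_by_message(rows):
    """Count system job rows per message name (rows are dicts from the 'value' array)."""
    return Counter(row.get("messagename") or "(none)" for row in rows)


if __name__ == "__main__":
    print(build_suspended_jobs_url("https://example.crm.dynamics.com"))
    # Made-up sample rows, only to show the shape of the tally.
    sample = [
        {"name": "job 1", "messagename": "ExampleMessageA"},
        {"name": "job 2", "messagename": "ExampleMessageA"},
        {"name": "job 3", "messagename": None},
    ]
    for message, count in tally_by_message(sample).most_common():
        print(message, count)
```

Grouping by message name is what made the pattern visible in our case: the plugin-style job titles said little, while the message name pointed at solution imports and updates.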
Why did the jobs get suspended then? The answer lies in what happens after restoring the environment from a backup: it gets put into administration mode by default. The intention here is quite sensible, since you wouldn’t want any integrations from the newly restored environment to be talking with the outside world. This could cause issues when you’d have multiple Dataverse environments connecting to the same target systems, potentially causing duplicate data and messages to be created.
The challenge here is that in today’s Dataverse / Dynamics 365 environments there are first-party integrations that also rely on features that admin mode disables by default. These will keep running as system jobs inside the environment, yet they can’t complete their tasks and are therefore left in the queue as suspended jobs.
In a small CRM style environment like ours, this caused 10 GB worth of additional data to accumulate in the Dataverse tables within 6 months. While system jobs are now stored in the cheaper file capacity rather than the expensive database capacity, it’s still quite a lot of unwanted storage consumption from built-in features.
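To put the figures above side by side, here is a small back-of-the-envelope calculation. It only uses the numbers quoted in this post (7 GB vs. 17 GB, 8.4k vs. 51k rows, 6 months). It assumes that the restore environment started out at roughly the size production is today and that the growth was even over time, neither of which was measured.

```python
def growth_summary(production_gb, restore_gb, months, production_rows, restore_rows):
    """Compare a dormant restore environment against production."""
    extra_gb = restore_gb - production_gb
    return {
        "extra_gb": extra_gb,
        "size_ratio": restore_gb / production_gb,
        # Assumes even growth over the period, which is a simplification.
        "gb_per_month": extra_gb / months,
        "row_ratio": restore_rows / production_rows,
    }


if __name__ == "__main__":
    summary = growth_summary(
        production_gb=7, restore_gb=17, months=6,
        production_rows=8_400, restore_rows=51_000,
    )
    print(f"Extra storage: {summary['extra_gb']:.0f} GB")             # 10 GB
    print(f"Restore vs. production size: {summary['size_ratio']:.1f}x")  # 2.4x
    print(f"Average growth: {summary['gb_per_month']:.1f} GB/month")     # 1.7 GB/month
    print(f"AsyncOperation rows: {summary['row_ratio']:.1f}x")           # 6.1x
```

In other words, an environment nobody touched was adding well over a gigabyte per month on average, with about six times as many system job rows as the live production environment.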
Obviously the administration mode is not designed to be a permanent state for any Power Apps or Dynamics 365 solution’s hosting environment. This does highlight the fact that it’s not possible to simply “freeze” a Dataverse environment and keep a snapshot of your data and configuration for a longer duration in the MS cloud. All live environments will get updates to system solutions sooner rather than later, thus altering the state of the database. While the business data in the Dataverse tables will be preserved as-is, the metadata and its surrounding maintenance processes will keep on living their lives.