Issue:

Archiving partial backups of VMware clients on CFAs with firmware version 5.0 or later takes far more space on the archive drive than reported on Jobs > History.

Cause:

Starting with IDR v5.0, VM backups obtained through the VMware integration allow file-level restores from within the VM. To make this possible, the system treats partial backups of VMware clients differently than it did in earlier versions.

In IDR v4.0, we introduced the VMware integration. In IDR v4.0 and later, when we back up a VM we take a snapshot of the current system state and save it. That snapshot can then be used to restore the entire virtual machine to the state it was in when the snapshot was taken. You can restore over the original machine or restore to a new virtual machine.

Partial backups are taken using Changed Block Tracking in VMware. If this feature is supported in your environment, it allows us to retrieve only the portions of the disk that have changed since the last backup. To restore a virtual machine from a partial backup, you need the entire job chain on the system: the most recent full backup, the most recent differential backup, and any incremental backups between that differential and the incremental you are trying to restore. When you archive jobs from VM clients in this version, the data that was backed up is copied to the archive disk. To restore a partial backup from an archive drive, you may therefore need to import jobs from several archive drives back to the RAID in order to have the full backup chain.
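The chain rule described above can be sketched in a few lines of Python. This is an illustration only, not how the CFA implements it: the `Job` record, its fields, the drive labels, and the `restore_chain` function are all hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: int
    kind: str           # "full", "differential", or "incremental"
    archive_drive: str  # hypothetical label of the drive holding the job


def restore_chain(jobs, target_index):
    """Return the jobs needed to restore jobs[target_index], oldest first.

    `jobs` is assumed to be in chronological order.
    """
    target = jobs[target_index]
    if target.kind == "full":
        return [target]

    chain = [target]
    # Only an incremental target needs a differential and earlier incrementals.
    need_differential = target.kind == "incremental"
    for job in reversed(jobs[:target_index]):
        if job.kind == "full":
            chain.append(job)
            break
        if need_differential:
            chain.append(job)
            if job.kind == "differential":
                need_differential = False
        # Partials older than the chosen differential are skipped.
    else:
        raise ValueError("no full backup precedes the requested job")

    chain.reverse()
    return chain


jobs = [
    Job(1, "full", "A"),
    Job(2, "incremental", "A"),
    Job(3, "differential", "B"),
    Job(4, "incremental", "B"),
    Job(5, "incremental", "C"),
]
chain = restore_chain(jobs, 4)
print([j.job_id for j in chain])                  # [1, 3, 4, 5]
print(sorted({j.archive_drive for j in chain}))   # ['A', 'B', 'C']
```

In this made-up example, restoring job 5 requires jobs from three different archive drives, which is the situation the paragraph above warns about.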

In IDR v5.0, we introduced file-level restores for VM client backups. To allow this, we started storing all partial VM backups as synthetic fulls. The system takes the changed block information gathered at the time of the backup and combines it with the information from previous backups, then stores the result as if it were a full snapshot of the machine at the time the backup was taken. Storing the backup this way allows us to open it and restore individual files from within the disks that were backed up. Once a backup is stored as a synthetic full, the CFA treats it as a full backup from then on. Restoring the entire VM becomes easier: because every backup is a full, you only need the backup from the day you are restoring from rather than an entire chain.

One drawback of VM jobs being stored and treated as fulls is that when they are archived (or replicated), the full job is copied rather than just the smaller changed block information. The system has to copy the entire size of the VM for every job it archives, so VM jobs can fill up archive drives quite quickly. On the other hand, recovering VMs archived this way is much easier: each job copied to the archive is a full, so you only need to import the job from the date you wish to restore to. VM jobs archived in this manner also retain the file-level restore ability once imported back onto the RAID.
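To give a feel for how quickly the archive drive fills, here is a rough back-of-the-envelope estimate in Python. The function name and all sizes are invented for illustration, and the sketch assumes no compression or deduplication on the archive drive, which may not match a real system.

```python
def archive_space_needed(vm_size_gb, changed_gb_per_partial, synthetic_fulls):
    """Estimate archive space for one full backup plus a series of partials.

    synthetic_fulls=True  models the 5.0 behaviour described above:
                          every archived job is copied at full VM size.
    synthetic_fulls=False models the 4.0 behaviour: the full backup plus
                          only the changed-block data of each partial.
    """
    partial_count = len(changed_gb_per_partial)
    if synthetic_fulls:
        return vm_size_gb * (1 + partial_count)
    return vm_size_gb + sum(changed_gb_per_partial)


# Hypothetical 500 GB VM with six partial backups of a few GB each.
changed = [5, 8, 3, 6, 4, 7]
print(archive_space_needed(500, changed, synthetic_fulls=True))   # 3500
print(archive_space_needed(500, changed, synthetic_fulls=False))  # 533
```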

In the 5.1 release, we added the dehydrated archiving feature. Enabling this feature reduces the amount of data copied to the archive drives, but it has some trade-offs. To make the feature work, the system stores partial backup jobs on the CFA in a hybrid of the 4.0 and 5.0 formats. When a partial backup of a VM client is taken in 5.1 or later, it is still converted to a synthetic full as in 5.0, but the changed block information is also stored in the job as in 4.0. As a result, jobs stored on the RAID take up slightly more space than they would if stored strictly in the 5.0 format. The system stores backups this way regardless of whether dehydrated archiving is enabled.

Enabling the feature causes the system to copy only the changed block data from the stored job to the archive drive, so VM jobs copied to the archive drive behave like jobs archived in 4.0. You lose the ability to perform file-level restores from jobs on the archive drive, and to recover a partial backup you have to import the entire job chain, again possibly spanning several archive drives.
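The trade-off can be summarized with a small Python model. It is only a sketch of the behaviour described above: the class names, field names, and sizes are hypothetical and do not correspond to any actual CFA setting or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredPartial:
    """A partial VM backup as kept on the RAID in 5.1 and later (hybrid form)."""
    synthetic_full_gb: float  # full-size image, as in 5.0
    changed_block_gb: float   # changed-block data, as in 4.0


@dataclass(frozen=True)
class ArchivedCopy:
    size_gb: float
    file_level_restore: bool  # can individual files be restored after import?
    needs_job_chain: bool     # must the whole chain be imported to restore?


def archive_partial(job, dehydrated):
    """Model what reaches the archive drive for one partial backup."""
    if dehydrated:
        # Only the changed blocks are copied; the copy behaves like a 4.0 job.
        return ArchivedCopy(job.changed_block_gb, False, True)
    # The synthetic full is copied; the copy behaves like a 5.0 job.
    return ArchivedCopy(job.synthetic_full_gb, True, False)


job = StoredPartial(synthetic_full_gb=500, changed_block_gb=6)
print(archive_partial(job, dehydrated=True))
print(archive_partial(job, dehydrated=False))
```

With these invented numbers, the dehydrated copy is 6 GB instead of 500 GB, at the cost of file-level restore and single-job recovery from the archive drive.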